Samizdat is a series of excerpts from an upcoming book on open source and operating systems that will be published later this year. AdTI did not publish Samizdat with the expectation that rabidly pro-Linux developers would embrace it. Its purpose is to provide U.S. leadership with a researched presentation on attribution and intellectual property problems with the hybrid source code model, particularly Linux. It is my hope that leadership would find this document helpful with public policy decisions regarding its future investment in Linux and other hybrid source products.
The United States is the home of the United States Patent and Trademark Office, an internationally respected agency which contributes to the worldwide effort to protect and govern intellectual property. In addition, the U.S. government is one of the largest patent holders in the world, owning the rights to 20,000 to 30,000 patents. Annually, the U.S. government also contributes billions to high-tech research and development because research and development supports our intellectual property economy. Therefore, it is in the U.S. government’s best interest to fully understand the impact of Linux on the intellectual property foundation of our country, as well as the entire information technology (IT) sector.
True Open Source vs. Hybrid Source
The Samizdat report recommends that the U.S. government should invest $5 billion in research and development efforts that produce true open source products, such as BSD and MIT license-based open source. Government investment in open source development will accelerate innovation. However, increased investment should be in true open source, open source without any stipulations, other than attribution and copyright notification, not hybrid source.
"Hybrid source code" is a phrase coined by former Tocqueville Chairman Gregory Fossedal. The term refers to products that mix open-source code with proprietary code, or that are compiled using procedures that do not diligently protect against such mixing.
While hybrid software appears to be the same as open source, it isn’t. Hybrid source code can never be true intellectual property. The actual purpose of hybrid source is to nullify its value as private property, which makes the hybrid source model significantly different from true open source. No one can ever truly accrue any value from owning hybrid source software, because everybody (and anybody) has the rights to every line of improvement in it. Worse, many argue that if hybrid source is used the wrong way, it can make other source code hybrid source as well.
The hybrid source model negatively impacts the intellectual property model for all software, and inevitably the entire IT economy. As long as the value of the IT economy is dependent on the preservation of intellectual property, it is counterproductive for the U.S. government to invest in Linux.
Linux is Inherently Unstable
The disturbing reality is that the hybrid source model depends heavily upon sponging talent from U.S. corporations and/or U.S. proprietary software. Much of this questionable borrowing is a) not in the best interest of U.S. corporations, b) not in the best interest of IT workers in America, and c) at a serious expense to the investment community, an entity betting on the success of intellectual property in the marketplace.
Linux is a leprosy, and it is having a deleterious effect on the U.S. IT industry because it is steadily depreciating the value of the software industry sector. Software is also embedded in hardware, chips, printers and even consumer electronics. Should embedded software become 'free' too, it would be natural to conclude the value of hardware will spiral downward as well.
In Samizdat, I argue that the inherent instability of hybrid source development such as Linux is due in great part to its inability to provide a sound policy for originating source code without attribution or IP problems. Within two days of the release of Samizdat, OSDL(1) member Linus Torvalds affirmed my concerns, announcing that Linux kernel contributions depend largely on "trust." In an attempt to fix the system, Linus Torvalds announced a new (but still ambiguous) policy(2) to promote better "trust."
Samizdat concludes that the root of attribution, IP misappropriation, and acknowledgement problems in Linux is, in fact, the trust model. Basically, Torvalds and other Linux advocates are admitting to using a ‘three monkeys’ policy for software development: see no evil, speak no evil, hear no evil. Specifically, Torvalds and the Linux kernel management team accept blind source code contributions. Then, they ask for a certification. But the certification does not hold the contributor, the Linux community, or Torvalds legally accountable. Nor does it guarantee that the source is produced in a 'clean room'. Meanwhile users are left to just 'trust' Linux too, legally left to face the ramifications of any significant legal problems. This is a 'wishful thinking' policy, and is not a sound approach for software development. The reality is that no one, including Linus Torvalds, can ever guarantee that code in the Linux kernel is free of counter ownership or attribution claims. I suggest that the U.S. government buy and invest in software from a confirmable entity, not from an assortment of unconfirmable sources. I am certain that inevitably, some unfortunate user of Linux will be facing an incalculable legal problem.
Meanwhile, we should also very plainly ask, “who[m] are we trusting?”
In a controversial section of Samizdat, I ask readers to pose some very hard questions about the origin of the Linux kernel. This is for a number of reasons, but especially because the same people that are selling the trust model cannot answer basic questions about what attribution, acknowledgement, and IP credit they may have owed ATT Corporation and/or Prentice Hall Corporation in 1991 when the Linux kernel was introduced. The same community that sells ‘trust’ is the same community that celebrates the theft of ATT Unix source code in the late 70s, joked about the theft of Windows source code in February, and, commenting on the Cisco source code theft in May, wrote in Newsforge, “maybe the theft will be a good enough reason for Cisco customers to check out open source alternatives….(3)”
Isn’t it fair to question the character and ethics of individuals and movements that espouse contempt for intellectual property? Isn’t it fair to question their character, when the core of their business strategy is trust?
Interviews for Samizdat
"... He says Linus couldn't possibly have written that much code," said Tanenbaum. "But there's tremendous variation from programmer to programmer -- some research I saw says maybe as high as 30 to 1 for great programmers and poor ones -- and Linus could easily be in the top 10 percent or top 1 percent of all programmers...."
Lisa Stapleton, Linux Insider, May 21, 2004
Tanenbaum and I agree on one point: the Linux kernel is an incredible, but conspicuous accomplishment. No one seemed to be interested in critiquing it. So subsequently, I decided to look into this, because we agreed it was no average feat. I (with help from staff and associates) collected evidence and looked at it a dozen different ways. Afterwards, I humbly concluded that the story in the public record about Torvalds and the Linux kernel is questionable. AdTI was kind enough to publish some of my findings, so readers could analyze the story for themselves.
To write Samizdat, I worked with (and quoted) many individuals directly or indirectly familiar with Linux development. I will continue to interview people within the open source profession about open source. It would be skewed and biased to only quote people that are anti-Linux or anti-open source. I have done this for years, and will continue to do so, regardless of what a source thinks of my theories.
As many are aware, I interviewed Professor Tanenbaum, the author of Minix, a copyright protected property by Prentice Hall. On March 8, 2004, Professor Tanenbaum sent me the following e-mail:
“MINIX was the base that Linus used to create Linux. He also took many ideas from MINIX, including the file system, source tree, and much more.(4)”
I met with Professor Tanenbaum not to write a treatise on software engineering, but to discuss the issue of software product rights and protection that he brought up in his email. In an interview with Tanenbaum, it became immediately noticeable that the professor was an animated, but tense individual about the topic of rights and attribution. He felt that well-known facts about Minix/Linux development should not have to be questioned. It was clear that he was very conflicted, and probably sorry that he sent the email in the first place.
Ironically, Professor Tanenbaum's recent comments only recapitulate many of the substantive contradictions regarding the early Linux kernel AdTI decided to discuss in Samizdat. I met with Professor Tanenbaum with the hope of resolving some of these inconsistent and contradictory accounts in the public record.
Is it likely that a student (Linus Torvalds) with no operating systems experience, a non-Unix licensee, without any use of Minix or Unix source code, could build a functioning kernel in six months -- whereas it took you (Tanenbaum) three years to build Minix?
In Tanenbaum’s recent posts(5), he argues (as he told me) that there are "others" that have created Unix clones or operating systems within the same constraints. Tanenbaum’s argumentation only increased my doubt about the Torvalds story because the comparisons were too unbelievable. To accept Tanenbaum’s argument, one must assume that Linus Torvalds, at 21, with one year of C programming, was Doug Comer, an accomplished computer scientist, or smarter than the Coherent team, and of course a better programmer than the good professor too.
Tanenbaum told me about the Coherent project repeatedly, but it was easy to research that it was a completely different situation. First, it wasn’t a solo effort; it was a team. Second, the timeline was wrong. Tanenbaum told me it took two years, then corrected himself on his own website, writing that it took six years. Either way, it wasn’t six months. On his website, it seems now Tanenbaum is comparing the inventors of Unix, Dennis Ritchie and Kenneth Thompson, to Torvalds.
This comparison, if anything, should demonstrate why I was not very convinced by the professor. Both Ritchie and Thompson had exceptional familiarity with MULTICS, and then wrote UNIX from scratch. This is completely different from Linus, who says he started with nothing and had no experience. Another reason this is interesting is that the Ritchie and Thompson kernel was 11,000 lines of code written over a number of years, and the Torvalds kernel was 32,000 lines in under a year.
Another problem with Tanenbaum’s logic is that he only presents examples of people that were Unix licensees, had Unix source code, or who were exceptionally familiar with software development. He cannot provide one example reasonably comparable to the Torvalds case.
Why do accounts continually assert that Torvalds "wrote Linux from scratch"?
Presumably, Professor Tanenbaum was not in Linus Torvalds's apartment at the time Linux was, to use a phrase recently (but only recently) disclaimed by Torvalds, "invented." Yet Tanenbaum vehemently insists that Torvalds wrote Linux from scratch, which means from a blank computer screen to most people. No books, no resources, no notes -- certainly not a line of source code to borrow from, or to be tempted to borrow from. But in a number of interviews and correspondence I conducted with individuals about operating system development, almost everyone reported that it is highly unlikely that even a pure genius could start from a blank computer screen and write the early Linux kernel.
But suppose he could. Would he?
In fact, everyone reported to me the opposite, that it only makes perfect sense to start with someone’s code, or framework, which is the common practice among programmers.
Furthermore, in almost every interview with experienced computer science professionals, almost all said that they personally had a copy of the Lions' notes, an illegal distribution of Unix source code. Even Tanenbaum admits to teaching from the Lions' notes. Linus says he started with nothing. In a recent ZDNet interview(6), he denies having the Lions' notes. This is also unbelievable to AdTI. The story is too amazing: everybody that I met who knew Linus intimately enough to confirm he wrote the kernel from scratch had an illegal copy of the Lions' notes, but Torvalds was never even near the Lions' notes.
Meanwhile, an associate of mine asked Richard Stallman, who started with the Mach Kernel, why his GNU team could not build a kernel as fast as Torvalds. Mr. Stallman provided a credible, believable set of reasons why building a kernel was not a simple task. I thank Mr. Stallman for his forthrightness and honesty. We included this interview to provide another perspective for readers to understand the magnitude of the Torvalds story. To accept the accepted story of the origins of Linux, Torvalds would also have been light years ahead of a team that built the very compiler he needed to make the kernel work.
The GNU team contributed their GCC compiler, a complicated product with over 110,000 lines of code, to the Linux project. Without the compiler, it is very likely that the Linux project would not have succeeded. The GNU team only asked that the product be called GNU/Linux, a very simple request for helping to make him famous. But Torvalds silently but deliberately let the naming idea die.
If Linux was based on Minix, doesn’t it owe rights and attribution to Prentice Hall? Does it owe attribution or rights to anyone else?
How much "inspiration" or "code and ideas" (Eric Raymond) did Linus get from Minix? I argue that it was clearly enough to warrant crediting the Prentice Hall product. Nor is an occasional tip of the hat in conversation sufficient; rather, I am talking about an attribution within the copyright and/or the credits files of the kernel. Quite noticeably, however, there is not one acknowledgement of Minix anywhere in the Linux kernel. If a typical college student were to hand in a term paper exhibiting this level of academic probity, wouldn't she or he be reprimanded?
Almost daily, we receive new contradictions from people on this point. In a published interview between Eric Raymond and Linus Torvalds, Raymond brandishes how Torvalds basically derived Linux from Minix. But in a ZDNet interview last month, Torvalds insisted that he didn't start with Minix, but did get ideas from Unix(7).
What is anybody supposed to believe?
The larger issue is that Minix was a copyrighted product, for academic use only. The Minix license insisted from 1987 to 2000 that any commercial use of Minix for any reason, required permission of Prentice Hall. The Linux kernel was released in Fall 1991, well within the Prentice Hall proprietary license period. On the point of the license issue, Tanenbaum would just nervously repeat that he succeeded in getting Prentice Hall to change the license to BSD, so the topic was irrelevant. AdTI asks readers to ask why? Why did the license issue matter to Tanenbaum?
Tanenbaum insists that we are wrong to bring any of this up, but ironically, he comments on his site, “…but Linus' sloppiness about attribution is no reason to assert that Linus didn't write Linux(8).” AdTI is not suggesting that readers believe that Prentice Hall is going to sue. The point of the paper is to magnify potential problems associated with this type of software development. AdTI insists that development such as this is an accident waiting to happen; something that will seriously impact both Linux users and developers. For example, in the case of Minix/Linux, AdTI argues that hypothetically, a copyright infringement case could easily erupt, if someone was determined to prove that Linux was an unauthorized derivative product of Minix.
The final reason why AdTI decided to focus on this issue is because we learned that in fact, Prentice Hall took all of this very seriously and had previously sued a programmer for unauthorized development of Minix.
Follow Up With Torvalds
AdTI contacted Torvalds's employer OSDL to interview him for clarification. Without any facts, Tanenbaum goes as far as to post that AdTI did not try to contact Linus, but this is contradicted by the attached post. The OSDL contact person tells AdTI that if Linus doesn’t get back to us, he is not interested in being interviewed. AdTI has no problem publishing a report, whether sources do, or do not want to talk with us.
For years, Linus has been credited with being an inventor. AdTI argued the claim was false. Coincidentally, in a recent interview, Linus decided he was not the inventor of Linux, commenting in a ZDNet story, "I'd agree that 'inventor' is not necessarily the right word…(9)"
AdTI publishes its work for all audiences. It is written so that even if a group of elementary school children asked Tanenbaum the same questions AdTI did, they would see the very contradictions we reported.
Vrije University is a very cool place. AdTI encourages anyone that spends any time in Amsterdam to visit. At the good professor’s recommendation, AdTI spent a number of hours talking with Vrije University computer science faculty. They were great fun and extremely helpful. For that, we are also very grateful.
Professor Tanenbaum did not convince AdTI that Linus Torvalds wrote the Linux kernel from scratch. We are sorry if this has caused any inconvenience to Professor Tanenbaum or anyone else.
There is far too much boasting about stealing, reverse engineering, and illegal copying espoused by some within the open source community. If the theft of the Lions notes had not become such a banner waving incident, our research team probably would have never been inspired to write Samizdat. The purpose of Samizdat is to demonstrate how and why the hybrid model encourages these types of activities.
AdTI argues the best way to solve this problem is to create a more substantive pool of true, free open source code. For example, Vrije University would be an excellent candidate for research and development dollars to produce more open source. To this day, Linux is siphoning resources from proprietary software companies. Encouraging this activity would be a significant mistake for the U.S. government.
Unix is one of the greatest achievements in the history of computer science. Like other great inventions, the existence of a robust intellectual property model enabled Unix investors, developers, and users to reap significant rewards. We should support both invention and innovation. However, building a product that starts with the accomplishment of others and announcing it as completely your own work product, is not invention, nor is it innovation. Innovation can only work properly if innovators properly credit the work of others, especially if the innovator has decided to introduce the product into the marketplace for commercial gain. Nevertheless, AdTI concludes that U.S. Government investment in true open source development would significantly bolster the IT industry sector; and conversely, investment in hybrid open source will deteriorate it.
Kenneth Brown is president of the Alexis de Tocqueville Institution and director of its technology research programs. He is the author of numerous research papers and popular articles on technology issues, including the 2002 report, "Opening the open-source debate," one of the first papers to raise serious questions about the security of open- and hybrid-source computer software, a point recently raised by the president of Symantec Corporation. He is reportedly "not the sharpest knife in the drawer," but nevertheless is able to converse with many intelligent people, and is accepted at fine restaurants and hotels around the world.
1. Open Source Development Laboratory, www.osdl.org
2. Under the enhanced kernel submission process, contributions to the Linux kernel may only be made by individuals who acknowledge their right to make the contribution under an appropriate open-source license. The acknowledgement, called the Developer's Certificate of Origin (DCO), tracks contributions and contributors. http://www.technewsworld.com/story/33961.html
3. "Commentary: If only Cisco code had been open source," May 17, 2004. http://trends.newsforge.com/trends/04/05/17/1932214.shtml
4. Tanenbaum, Andrew. Interview with AdTI. March 8, 2004.