Google engineer admits "strong indication that it is likely" he copied Sun code into Android
Posted on 09/08/2011 12:27:54 PM PDT by Swordmaker
This is the second of three consecutive posts on information gleaned from last night's filings in Oracle v. Google. I previously blogged about Sun's proposal to create a Red Hat-style Android distribution with open source Java.
While patents are the most important part of Oracle's lawsuit against Google, the copyright infringement part shouldn't be underestimated. Google is currently trying to get rid of it on summary judgment, but Oracle defends its related claims.
Judge Alsup denied the filing of various interesting documents under seal, so they entered the public record last night. Also, documents that were heavily redacted are now much less redacted or even unredacted.
One of Oracle's allegations of direct copying of code from the Java codebase into Android concerns a function that is in the Arrays class of Java and resurfaced in the Android TimSort class. The following image shows a line-by-line comparison of the two code segments -- it's fair to say they're simply identical:
Google said (in its motion for summary judgment on the copyright infringement claims) that "[t]hose nine lines (which are the same in both of the Android files) implement a mundane utility function". So rather than denying that there was copying in play, Google argues that it should be allowed to copy such small amounts of code.
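For readers who want a concrete sense of what a "mundane utility function" of this kind looks like, here is a minimal sketch of an array range check. It is written from the description in the filings and the deposition (three arguments, exceptions thrown on bad bounds) and is not a verbatim copy of either exhibit. The class name, the parameter names and the exact message text are assumptions made for illustration.

```java
// Illustrative sketch only: not the code from the court exhibit.
class RangeCheckSketch {

    // Hypothetical helper: checks that the range fromIndex..toIndex
    // fits inside an array of the given length, and throws otherwise.
    static void rangeCheck(int arrayLength, int fromIndex, int toIndex) {
        if (fromIndex > toIndex) {
            throw new IllegalArgumentException(
                "fromIndex(" + fromIndex + ") > toIndex(" + toIndex + ")");
        }
        if (fromIndex < 0) {
            throw new ArrayIndexOutOfBoundsException(fromIndex);
        }
        if (toIndex > arrayLength) {
            throw new ArrayIndexOutOfBoundsException(toIndex);
        }
    }

    public static void main(String[] args) {
        rangeCheck(10, 2, 5); // valid range, returns silently
        try {
            rangeCheck(10, 5, 2); // bounds reversed
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "fromIndex(5) > toIndex(2)"
        }
    }
}
```

A check like this leaves little room for variation in its logic, which is why the dispute centers on identical names, argument order and message text rather than on the algorithm itself.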
But in connection with other infringement allegations, Google claimed (in its reply to Oracle's amended complaint last year) that there was "Third Party Liability":
"Any use in the Android Platform of any protected elements of the works that are the subject of the Asserted Copyrights was made by third parties without the knowledge of Google, and Google is not liable for such use."
Against that background, it's quite interesting to see that the copied code segment shown above was created by a Google engineer -- not a third party -- and the said engineer (Joshua Bloch, who previously worked for Sun and whose title at Google used to be, or still is, Chief Java Architect) even admitted in a deposition that he likely had access to the original Sun code when he wrote that code segment:
"Q. BY MR. JACOBS: Do you have a recollection of accessing Sun code while you were working on TimSort?
A. I don't have a recollection, but I'm perfectly willing to believe that I did. You know, I think the similarity of the signature, the fact that, you know, the three arguments are in the same order and have the same name, you know, is a strong indication that it is likely that I did."
Asked about the motivation for so much similarity between the original code and his code, he testified:
"[...] the only functionality that it shares is this little function here, and it is very much in the interest of the users of the new sort that it behave exactly like the old sort. You want it to throw exactly the same exception. You want it to actually emit the same prose. You want that text to be the same.
So, you know, it's the one where it makes sense to [have similar signatures]."
And in an earlier part of the deposition, he talked about how he wrote the source file at issue, TimSort:
Q. And you write to Steve: "PS, I am currently working on a drop-in replacement for Harmony's sort function, which has demonstrated a huge up to 20X performance improvements on G1 hardware. This will be my first contribution to Android."
Do you see that?
A. I do.
Q. What is the -- what's the name of that drop-in replacement?
Q. And what was the project that you were doing that led to your developing this TimSort drop-in replacement?
A. I undertook it personally a couple years earlier after a conversation with Guido van Rossum. TimSort is Python's system sort. He told me about it. He said it's really fast. I said, oh, wow, I wonder if we can make it work in the Java programming language and contribute it to all the platforms.
Q. And so when you say you undertook it personally, in what way did it fit into your job duties?
A. My job duties are moderately flexible, and if I see something that, you know, I think could be beneficial to Google and to the broader Java ecosystem, and my manager doesn't object, I do it.
So much for third-party liability.
Also, another part of the deposition clarifies that this wasn't just Python code that Oracle obtained from a different original right holder:
Q. The RangeCheck function that we're looking at on page Google 551599 did not come from Tim Peters' list sort for Python; correct?
A. C does not have exceptions. So it most certainly did not.
Here's another interesting excerpt from his deposition. Look how he tries to avoid admitting that he's still Google's Chief Java Architect:
Q. And so what is your -- you are still employed by Google as of today?
A. I am.
Q. And what titles do you have at Google?
A. I am a -- I believe they call it senior staff engineer. I don't -- I don't -- or senior staff software engineer, and I use the courtesy title of chief Java architect occasionally.
Q. Then you continue to serve as Google's representative at the [Java Community Process]; correct?
Q. And then -- and you have a role at the Open Source Programs Office up through today; is that correct?
Q. And at some point you were a quote, member, unquote of the Android team?
A. That is correct.
Google engineer admits: API design is a creative activity
Another deposition is interesting with a view to the debate over the copyrightability of APIs. If an API is creative expression, that's a strong argument in favor of copyright protection. Here's what Bob Lee, a Google engineer who developed among other things the award-winning Guice framework, said about this:
Q. Would you say that designing APIs is a creative activity?
[objection to form, by Google lawyer]
THE WITNESS: Yes, absolutely.
According to a now-unredacted part of one of its pleadings, Oracle also elicited a telling statement on APIs from Joshua Bloch: "[I]f someone else were to take this prose and publish it for profit, Sun would probably be upset, and with good reason."
Obviously it's not up to one or two programmers to decide whether APIs are creative. But these statements are definitely useful to Oracle as it claims copyright protection for its Java APIs.
News on the "test files" controversy
The new filings also provide some interesting information on the question of whether certain Java classes that were apparently decompiled and then incorporated into the Android codebase were just test files, and what it would mean legally if they were.
Back in January I published seven such files and then rebutted claims that I had overstated their significance and showed that some (but not all) source availability packages of Android device makers included those files at the time.
Oracle pointed out (which was previously known) that "[t]hese files were not contained in the 'test' area on Oracle's directory" and says that even if they may only have been used as test files by Google, "test files are still valuable". And here's a previously-redacted statement about the commercial value of test files:
"Google paid a software engineering firm $900,000 to develop a software test suite for the Java core libraries."
Noser is the name of the firm that performed that work and possibly some other work for Google. There might be some third-party liability in this context, but even then Google would be responsible for distributing infringing code, even if Google could try to recover the related costs from a third party if there's a contractual basis for that. And Joshua Bloch's testimony shows that Google can't simply attribute all of those infringement issues to unnamed others.
If you want on or off the Mac Ping List, Freepmail me.
Of interest perhaps.
I’m not a coder, but it seems that in these object languages, there are only so many ways to do things.
My company does not allow ANY copying of ANY source code without legal review. This is why. One little coder that uses someone else’s source code, even just a few lines, can cause major headaches. Intellectual property rights are nothing to fool around with. A company that does can find that someone else owns the profits.
As noted by another poster, there are a limited number of ways to code a solution in a given language. Experienced coders will often come up with the same or similar results. Variations in white space, layout, variable names and comments would be expected. The "side by side" example looks like a clear "cut and paste" kind of coding where source code to the necessary solution was available and the snippet was copied to save typing. It's a common practice, but most folks are not laboring under concerns that the end product will end up under scrutiny in a court room.
(1) This example is incredibly trivial. Any decent programmer that was given the task to code this routine would come up with something very similar.
This is like someone quoting the first line of an entire book, and then being brought up on plagiarism. “It was a dark and stormy night.”
(2) Nonetheless, it was lifted. The variable names are exactly the same, a very unlikely occurrence.
It's been a long time since that's been true.
The original implementation of C++ called CFront, written by the creator of C++ itself, was just a preprocessor that translated C++ code into C. But as C++ grew in complexity (specifically when they added exception support!) C++ compilers went native.
Software is a real can of worms..
There are only so many ways to skin the cat, so to speak.
No sense in rediscovering that which is already tried and true..
but then, what would patent lawyers have to do?
Let a jury decide? Yeah, that’s the ticket.
but recall that C++ ultimately gets compiled down to C then to assembly.
not necessarily so. maybe other compilers do, but not MS.
.Net languages compile to IL to be parsed by the JIT assembler emitting native code. Non .Net is compiled directly to native.
Developers at MS are fully aware to never introduce Open Source into MS product code. It's a serious offence. Not sure what other corps do along these lines.
Software is an implementation of mathematical and logic-based algorithms, therefore not patentable.
The whole subject is a waste and only about making easy money.
When asked to write an interface for an employer, I had some latitude and instead, in the same timeframe, a few weeks, wrote a generic interface system, which enabled me to build interfaces after that with no coding, just by designing file layouts using an interface-layout building program, which took only a few minutes instead of the few weeks of design, code and test for writing a typical interface. I then left the company and went back into consulting.
Now, I would be stealing if I took a copy of that interface system and provided it to another client or employer in the future. But I still remember how I did it, so I could easily do it again, though it would also take a few weeks to design, code and test it. In that case, the new client is paying for my experience and capabilities, and they are paying me to build the same type of system for them, though it would not be identical, because I don’t have photographic memory. Even if I did have photographic memory, and essentially rebuilt the system exactly, including an exact copy of the source code of the programs, I still would be retyping the whole thing, rebuilding from scratch and testing the new system. Now, one could successfully argue that the photographic-memory copy is violating copyright law - it would be, since the copyright of a work for hire is owned by the employer. However, for a brand new design, where I can’t remember exactly what I did, but I’m making another generic interface system from scratch, I would be using the same algorithmic approach - the same math and logic - but I would be using all different names for variables and functions (sections of code). The algorithms - being math and logic - can’t be copyrighted or patented without making it impossible for anyone in the U.S. to write a generic interface system for 17 years or have to pay my ex-employer royalties. Which is utter nonsense.
The big deal is copyright - if my next client and I cook up a scheme where I copy everything I originally did before I leave my employer, then walk in to the client Monday morning, load up my system and run it in 5 minutes - that’s stealing the work I did for my former employer. The new client is too cheap to pay me for a few weeks work - or if they pay me a few weeks worth of time for just those 5 minutes, then I’m too lazy to do the work and simply want to get paid for a few weeks and not work.
Legally and ethically, it’s real straightforward: pay to have work done instead of stealing from other companies. It works out best for all involved.
And let’s not go on allowing patents to be issued for business processes of the workflow required to operate the Accounts Payable department (how can that be enforced ?) or the fastest logic for flipping bits on a screen (which is simply the product of the physics of the device and logic).
If one knows how to run a software company, proprietary bodies of work can be reasonably protected with a few basic - and legal and ethical - tactics enough so that market share and success can be had. Patents merely allow discoveries to be withheld from the marketplace by big businesses that either can’t figure out how to commercialize them or perhaps aren’t motivated enough to do so. Big business is infamous for being inept at commercializing innovative ideas.
The author of the article needs to have every claim treated with a huge bag of salt. He is a hack for Oracle.
That being said - example 1) You can’t copyright an API! Oracle can wish that all it wants - but it is a losing battle. It’s been through the courts before, and doesn’t wash.
An API is a specification on how to talk to a functional unit. It is NOT the functional unit itself. Implementation of that API is the functional piece of software that can be copyrighted.
This is right up there with SCO trying to claim copyright ownership of errno.h from the C world. It didn’t work for SCO and won’t work for Oracle.
Further - there is real power in Google’s argument about being a trivial piece of code out of millions of lines of code. They are admitting that there are something like 9 places where they may have copied a few lines of code (this example given is of that magnitude.) 9 places constituting around maybe 100 lines of code out of multiple millions makes it trivial. No way Oracle can call that a derivative work.
Lastly, as others have pointed out, there are many places where there is only one reasonable way to code something. Implementing an API puts you in a further straitjacket.
Ignore anything that Florian Mueller publishes, because it is going to contain a very slanted point of view. I would suggest following Groklaw.net if you want to see a more accurate discussion of the blow-by-blow of the Google/Oracle lawsuit. They’ll also educate you on what the law is and what is important.
Sad to say, I could never make sense out of C syntax and structure.
You and me. Sadder to say, this too clever by half programming language has been the basis of just about every programming language that came after it, perl, Java and others that have tried to improve on the unimprovable.
I'm well aware the backends have gone to a more native approach. GNU compilers emit RTL. I'm not real surprised to hear the MS .Net compilers are putting out code compliant with that backend.
If you think of C as the closest thing to a “platform independent assembly language” as can exist, it starts to make more sense.
I can certainly see that, but it's the nuts-and-bolts, practical side of understanding the structure, the grammar and the maddening, seemingly cryptic application of indentation that makes me crazy. T-SQL, BASIC, PASCAL and other database programming languages were never that hard to understand. Boy, did I date myself with that.
With C you can indent any way you want. The BNF grammar for C considers tabs, spaces, and newlines as whitespace.
As a coder (mainly PHP), I usually get my information from various sites that offer code as a sample. The coder from Google could have gotten it from there.. I don’t know.
Joshua Bloch isn't your everyday, run-of-the-mill Java coder.
Sorry judge, it was only a little bit! Honest!......I only shot him once - with a 22!.......I only grabbed her hooters a little bit! Bwaaahhhhaa....
It is true that there are only so many ways to implement some code functions but this is just plain ass laziness. Of course it was copied. Of course someone else's work was stolen without credit.
I have to disagree. I read and post articles from both. Florian Mueller has been spot on and pretty even handed. He publishes articles following the law as written, not as people wish it were in an idealized world. His viewpoint does fall more toward support of Free and Open Source Software licensing, and that's where his field of expertise lies. He also understands European laws far better than Groklaw, which is probably better in general law issues.