The open source software development process is going under the microscope as computer science researchers from the University of California, Davis, use a US$750,000 grant from the National Science Foundation to find out how systems such as the Apache Web server, the PostgreSQL database and the Python scripting language are built.
Their research will focus on how open source software projects avoid the slowdowns typically caused by individual contributors or departments and manage to produce quality code more quickly than commercial, proprietary development models.
“The belief in the open source software community is that open access to the source turns on all the available brain power, full blast, on every problem, challenge or opportunity,” lead researcher and UC Davis Computer Science Professor Premkumar Devanbu told LinuxInsider. “In traditional products, bits of code tend to be owned or controlled by specific individuals, and thus each bit of code can be on a single-threaded critical path. In open source, anyone can read and comment on a file.”
Social Software Study
Successful open source software projects manage to merge social structure and software structure effectively and avoid conventional problems associated with collaborative projects, including pacing by the slowest contributor, Devanbu indicated. Researchers will gain insights into some of these projects by combing message boards, bug reports and e-mail discussions.
“We’re not sure yet — we think that there will generally be a convergence of social structure modularity and artifact structure modularity, but it’s too early to offer any definite results,” Devanbu said. “Prior work by Carliss Baldwin and others comes at it from a different perspective. They argue, using the case of Mozilla, that modularity increases volunteerism. Our goal is to validate these beliefs quantitatively, and it’s too early to say for sure.” A review of the results is expected within six to eight weeks.
Lesson of Linux
Quality tends to improve under the open source approach because of modularity, which allows a division of labor and knowledge, Devanbu noted.
“Thus, good design allows implementation to proceed with maximum parallelism and minimum synchronization and coordination,” he said.
Poor modularity gives rise to social interaction problems, Devanbu observed, and those problems in turn increase the pressure for more modularization, as was the case with Linux.
“As Linux lore has it, when [Linux creator Linus] Torvalds was slow in getting changes to the kernel, the resulting dissatisfaction and flame battles led to the modularization of the kernel and the appointment of lieutenants to oversee each module, which in turn improved the efficiency of the social structure underlying the Linux community of practice,” he explained.
It’s About the Code
The UC Davis researchers said they would also delve into how open source software projects are able to avoid being paced by the slowest contributors.
One reason may be that open source contributors are brought together by the software itself, as opposed to a job or money, said Josh Berkus, core team member of the PostgreSQL open source database project, which released a beta of its latest version, 8.2, this week.
“It’s possible that the software design reciprocally influencing the character of the project participants is something different about open source — mostly because, unlike a company, the code is what holds the community together, not paychecks,” Berkus told LinuxInsider.
Secret to Success: No Meetings
How are open source software projects able to match their speed and quality to their best participants rather than their slowest? That’s simple: “No meetings,” Berkus said.
“I’m serious,” he continued. “In a large proprietary software development environment, engineers spend four to nine hours a week in meetings, where they are given assignments by managers and expected to work on only their assigned project for the next week. Areas of responsibility are carved out carefully and elaborate quality control and review processes are enforced. The result of all this is to pace the engineers to the plodding pace of management, so that they can stay in control of the project.”
Another reason open source development moves more quickly is that engineers work on the projects they want to work on, which limits procrastination and “sandbagging,” said Berkus.
Finally, Berkus explained that open source developers are less apt to build to incorrect specifications, since the project and its goals are their own.
“Open source projects are less likely to follow ‘wrong’ specifications, because the same people who write the code are the ones setting the goals,” he noted.