Army computer science innovators have started peeling away the barriers to publicly releasing software from the military to the open source community. This means that users around the world will have the ability to see and modify the actual code of those government projects to suit their needs, and potentially pass those changes back to the Army if they prove useful.
The U.S. Army Research Laboratory (ARL) made its first open source release—Dshell, a network forensics analysis framework for security analysts to easily read and process network activity following an attack—nearly two years ago. The Dshell team faced extensive requirements related to liability, intellectual property, and operational security before its tool was posted to an online repository.
The challenge of publicly releasing computer code is not new, and it is not unique to ARL: agencies across the federal government are looking into the best way to release open source software. The approval process and other requirements that the Dshell team endured have gradually been formalized into the ARL Software Release Process for Unrestricted Public Release, announced in November 2016, just three months after U.S. Chief Information Officer (CIO) Anthony Scott and the White House called for greater release of custom code created by federal agencies in the Federal Source Code Policy, M-16-21.
Before it was publicly released, Dshell had a small, informal community of users in several other government organizations. Analysts, inside ARL or out, could use the tool and customize it to find and parse the exact information they needed from network data, such as domain name lookups, reassembled website requests or decoded malware traffic.
According to Tracy Braun, a computer scientist in the Network Security Branch of ARL and the team lead for the Dshell project, the ability to customize the tool and quickly share the changes within its small community made it a good candidate for open sourcing to the wider scientific community.
ARL released Dshell to GitHub, one of many websites that host repositories for open source content, for two primary reasons. First, Dshell is a useful tool for keeping networks safe. By sharing it with the world, more security teams gain another specialized tool to keep their networks secure. Improving the security of the Internet as a whole in turn improves ARL's own local security. The second reason is common to all open source software: to improve the quality of the tool by increasing the number of skilled eyes looking for bugs and potential improvements throughout the code.
GitHub was chosen for Dshell because it allows members to easily download software code and store edits they make, and provides a mechanism to offer feedback to the original software authors.
The Dshell team is aware of the risks of putting security-related government code into the wild. However, the benefits, in many cases, outweigh the risks. The Dshell team decided that providing the means for good actors to review the code and identify any weaknesses exploitable by bad actors is of greater value than attempting to keep it secure through obscurity.
Users can create copies of Dshell and do what they want with it. ARL, in this case, or the host organization of any open source release, has no control over the copies. This is a lot like sending someone a favorite recipe. You cannot stand over his or her shoulder to make sure the recipe is followed to your exact specifications. However, if savvy cooks make improvements to the recipe, they can be passed to you the next time you meet, making your version of the recipe better. The same is true with open source. If others in the community make improvements to the code, they can easily share them with the development team to incorporate into the official version. And that is just what happened.
As of June 2016, users have created more than 11,000 copies of the Dshell tool and have offered 62 suggested modifications. The shared modifications, formally named “pull requests,” do exactly what was hoped. Community members found and fixed bugs that the Dshell team missed, and even added new features that improve ARL’s ability to detect malicious actors. Additionally, rolling the enhancements into the official version makes it easier to share the software across organizations. Instead of emailing files or sending CDs, collaborators can be pointed to the GitHub page to download the latest updates.
Some agencies, like NASA, adopted open source early. In 2014, NASA released more than 1,000 of its projects in one mass distribution. Others—like the National Security Agency, the National Guard and the Air Force Research Laboratory—joined more recently.
The most comprehensive DOD guidance for open source software came from the DOD CIO in 2009. The memorandum addressed a popular misconception that open source software is forbidden by the DOD Information Assurance Policy.
Cem Karan, a computer engineer at ARL working to develop ARL’s formal open source process, described the more realistic hurdles for releasing Dshell and other ARL projects. “As an individual, open sourcing software means simply adding a user name and an email address, and then uploading or downloading software as I wish. Conversely, if I publish on behalf of ARL, I have a lot more to consider,” he said. That, he continued, includes “legal concerns with issues like trademark, copyright and patent law. For instance, open source code is generally released under a standard license that relies on copyright as an enforcement mechanism, but federal government works do not have copyright protection.”
The problem is that such licenses govern limits of liability and warranty, as well as how intellectual property can be used and shared. Without a license (or with a license that is declared invalid), releasing software may open the door to significant litigation against the government and against anyone who uses or contributes to government open source projects, Karan said.
“The White House has published a very high-level policy,” Karan said. “It will be up to individual agencies how to implement it.”
Open source DOD projects remain few. Though military organizations differ, there are three major reasons why more projects have not yet been released as open source—visibility, operational security and paperwork.
Releasing software to the world means just that: the world can see it. There is a certain amount of fear, even within the Dshell team, that releasing software could decrease its effectiveness because others will know how it works and how to avoid it. The risk has to be weighed against the benefits.
Open source is also not always a good fit for Army projects. Obviously, anything classified is precluded. Even something unclassified may not be releasable if it ties back to close-hold methodologies.
Publicizing software also requires a level of responsibility. Once a project is released, a community will form around it, and the community will expect a certain amount of feedback. It will expect answers to questions, responses to concerns and regular updates.
When deciding on software to release, Karan said, “it will take scrutiny of each project as we go forward into a new level of transparency.”
The U.S. CIO’s new policy makes it easier for organizations like ARL to realize the benefits of open source software. Dshell’s full-on jump into the ocean of open source projects helped find the path and potential pitfalls in the release process, and that should help future projects have an easier time with a public release. ARL is looking for more ways to open public access in a manner that is both meaningful to the Army and beneficial to the software community.
Karan has coordinated the publication of ARL’s open source policy to GitHub in hopes that other agencies will copy, use and change the document in a way that allows ARL to easily incorporate any feedback. The posting also allows other agencies to use ARL’s policy as a starting point for their own open source initiatives. “There is no point for everybody building from scratch, which is part of the reason for open source,” Karan explained.
The U.S. CIO said that, in the coming months, he expects the launch of the new website Code.gov, which ARL will use alongside other options to share computer code to support basic and applied research. For a research laboratory, releasing projects to the open source community provides an easy way to share the code and methods used in published papers, making external peer review simpler.
Karan described a recent experience with one of his projects. “I have a simulator project that showed amazing results—that is, until I found the bug that was making it so perfect. Once I fixed that bug, it went back to what you would actually expect. If I had published that paper and had the code out there, experts could debate the results and find the glitch. I would have had to retract the paper in that case, but I would have improved the science.”
Releasing research projects to the open source community provides wider visibility of computational expertise at the lab. It encourages openness and sharing in a constructive way that can potentially improve projects and processes.
“If we have projects that get traction, that’s a big success,” Karan added. “More importantly, about putting the code out, is that it helps us to improve the science.”
For more information about Dshell or about the ARL software release process for unrestricted public release, go to https://github.com/USArmyResearchLab.
Article originally posted on ASC.ARMY.MIL.