This blog was originally posted by me as part of the Breadcrumbs blog at MIT's Decentralized Information Group:
"Building accountability appliances involves a challenging intersection between business, law, and technology. In my first blog about how to satisfy the legal portion of the triad, I explained that - conceptually - the lawyer would want to know whether particular digital transactions had complied with one or more rules. Lawyers, used to having things their own way, want more... they want to get the answer to that question in a particular structure.
All legal cases are decided using the same structure. As first year law students, we spend a year with highlighter in hand, trying to pick out the pieces of that structure from within the torrent of words of court decisions. Over time, we become proficient and -- like the child who stops moving his lips when he reads -- the activity becomes internalized and instinctive. From then on, we only notice that something's not right by its absence.
The structure is as follows:
* ISSUE - the legal question that is being answered. Most typically it begins with the word "whether": "Whether the Privacy Act was violated?" Though the bigger question is whether an entire law was violated, laws tend to have so many subparts and variables that we often frame a much narrower issue based upon a subpart that we think was violated, such as "Whether the computer matching prohibition of the Privacy Act was violated?"
* RULE - provides the words and the source of the legal requirement. This can be the statement of a particular law, such as "The US Copyright law permits unauthorized use of copyrighted work based upon four conditions - the nature of use, the nature of the work, the amount of the work used, and the likely impact on the value of the work. 17 USC § 107." Or, it can be a rule created by a court to explain how the law is implemented in practical situations: "In our jurisdiction, there is no infringement of a copyrighted work when the original is distributed widely for free because there is no diminution of market value. Field v. Google, Inc., 412 F. Supp. 2d 1106 (D. Nev. 2006)." [Note: The explanation of the citation formats for the sources has filled books and blogs. Here's a good brief explanation from Cornell.]
* FACTS - the known or asserted facts that are relevant to the rule we are considering and the source of the information. In a Privacy Act computer matching case, there will be assertions like "the defendant's CIO admitted in deposition that he matched the deadbeat dads list against the welfare list and if there were matches he was to divert the benefits to the custodial parent." In a copyright fair use case, a statement of facts might include "plaintiff admitted that he has posted the material on his website and has no limitations on access or copying the work."
* ANALYSIS - is where the facts are pattern-matched to the rule. "The rule does not permit US persons to lose benefits based upon computer matched data unless certain conditions are met. Our facts show that many people lost their welfare benefits after the deadbeat data was matched to the welfare rolls without any of the other conditions being met." Or "There can be no finding of copyright infringement where the original work was so widely distributed for free that it had no market value. Our facts show that Twinky Co. posted its original material on the web on its own site and every other site where it could gain access without any attempt to control copying or access."
* CONCLUSION - whether a violation has or has not occurred. "The computer matching provision of the Privacy Act was violated." or "The copyright was not infringed."
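For readers who think in data structures, the five parts above can be sketched as a simple record. This is only an illustration in Python, not a description of how any accountability appliance stores its data; the class names `SourcedStatement` and `LegalFinding`, the field names, and the `missing_parts` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourcedStatement:
    """A statement plus where it came from (a statute, a case, a deposition)."""
    text: str
    source: str


@dataclass
class LegalFinding:
    """Hypothetical record mirroring issue / rule / facts / analysis / conclusion."""
    issue: str
    rule: SourcedStatement
    facts: List[SourcedStatement] = field(default_factory=list)
    analysis: str = ""
    conclusion: str = ""

    def missing_parts(self) -> List[str]:
        """Return the names of any parts that are still empty."""
        parts = {
            "issue": self.issue,
            "rule": self.rule.text,
            "facts": self.facts,
            "analysis": self.analysis,
            "conclusion": self.conclusion,
        }
        return [name for name, value in parts.items() if not value]


# Example values are paraphrased from the text above, for illustration only.
finding = LegalFinding(
    issue="Whether the computer matching prohibition of the Privacy Act was violated?",
    rule=SourcedStatement(
        text="US persons may not lose benefits based upon computer matched data "
             "unless certain conditions are met.",
        source="Privacy Act (computer matching provision)",
    ),
    facts=[
        SourcedStatement(
            text="The CIO matched the deadbeat dads list against the welfare list.",
            source="Deposition of defendant's CIO",
        )
    ],
)
print(finding.missing_parts())  # prints ['analysis', 'conclusion']
```

The `missing_parts` check is a nod to the lawyer's habit of noticing the structure by its absence: a record with no analysis or conclusion is visibly unfinished.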
In light of this structure, we've been working on parsing the tremendous volume of words into their bare essentials so that they can be stored and computed to determine whether certain uses of data occurred in compliance with law. Most of our examples have focused on privacy.
Today, the sub-rules, elements of rules, and facts are often so voluminous that a lawyer or team of lawyers does not have enough time to work through them all. So the lawyer guesses what's likely to be a problem and works from there; the more experienced or talented the lawyer, the more likely that the guess leads to a productive result. Conversely, this likely means that many violations are never discovered. One of the great benefits of our proposed accountability appliance is that it could quickly reason over a massive volume of sub-rules, elements, and facts to identify the transactions that appear to violate a rule or for which there's insufficient information to make a determination.
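As a rough illustration of that kind of triage, here is a minimal Python sketch with three outcomes: compliant, appears to violate, and insufficient information. It assumes, as a large simplification, that a rule can be reduced to a list of named conditions and a transaction to a dictionary of known facts. The names `Outcome`, `check`, `RULE`, and the condition names are hypothetical and are not the Privacy Act's actual elements.

```python
from enum import Enum
from typing import Dict, List, Optional


class Outcome(Enum):
    COMPLIANT = "compliant"
    VIOLATION = "appears to violate"
    INSUFFICIENT = "insufficient information"


def check(required: List[str], facts: Dict[str, Optional[bool]]) -> Outcome:
    """Compare one transaction's known facts against a rule's required conditions.

    A fact recorded as False is a failed condition; a fact that is
    absent or None is treated as unknown.
    """
    values = [facts.get(name) for name in required]
    if any(v is False for v in values):
        return Outcome.VIOLATION
    if any(v is None for v in values):
        return Outcome.INSUFFICIENT
    return Outcome.COMPLIANT


# Hypothetical conditions for a matching rule, for illustration only.
RULE = ["matching_agreement_in_place", "individual_notified"]

transactions = {
    "t1": {"matching_agreement_in_place": True, "individual_notified": True},
    "t2": {"matching_agreement_in_place": True, "individual_notified": False},
    "t3": {"matching_agreement_in_place": True},  # notification status unknown
}

results = {tid: check(RULE, facts) for tid, facts in transactions.items()}

# Only the transactions a lawyer would need to look at.
flagged = [tid for tid, outcome in results.items() if outcome is not Outcome.COMPLIANT]
print(flagged)  # prints ['t2', 't3']
```

Keeping "insufficient information" separate from "appears to violate" matters here: a missing fact is a prompt for the lawyer to go find it, not a finding of violation.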
Although we haven't discussed it, I think there also will be a benefit to be derived from all of the reasoning that concludes that activities were compliant. I'm going to try to think of some high value examples.
Two additional blogs are coming:
* Physically, what does the lawyer expect to see? At the simplest level, lawyers are expecting to see things in terms they recognize and without unfamiliar distractions; even the presence of things like curly brackets or metatags will cause most to insist that the output is unreadable. Because there is so much information, visualization tools present opportunities for presentations that will be intuitively understood.
* The 1st Lawyer to Programmer/Programmer to Lawyer Dictionary! Compliance, auditing, privacy, and a host of other topics now have lawyers and system developers interacting regularly. As we've worked on DIG, I've noticed how the same words (e.g., rules, binding, fact) have different meanings."