In cleaning up blog tags today, I realized that I never posted part III of this discussion. It's been posted at my MIT blog, but for completeness I'm posting it here. It's decidedly more technical than most of what's here but may be interesting for those following the thread of building technology to implement law.
--------------------------------------
I've written in the last two posts about how lawyers operate in a very structured environment. This will have a tremendous impact on what they'll consider acceptable in a user interface. They might accept something which seems a bit like an outline or a form, but years of experience tell me that they will rail at anything code-like.
For example, we see
:MList a rdf:List
and automatically read
"MList" is the name of a list written in rdf
Or,
air:pattern {
:MEMBER air:in :MEMBERLIST.
and know that we are asking our system to look for a pattern in the data in which a particular "member" is in a particular list of members. Perhaps because learning law already means learning to read, speak, and think in another language, most lawyers look at lines like those above and see no meaning.
Our current work-in-progress produces output that includes:
bjb reject bs non compliant with S9Policy 1
Because
phone record 2892 category HealthInformation
Justify
bs request instruction bs request content
type Request
bs request content intended beneficiary customer351
type Benefit Action Instruction
customer351 location MA
xphone record 2892 about customer351
Nearly every output item is a hotlink to something which provides definition, explanation, or derivation. Much of it is in "Tabulator", the cool tool that aggregates just the bits of data we want to know.
From a user-interface-for-lawyers perspective, this version of output is an improvement over our earlier ones because it removes a lot of things programmers do to solve computational challenges. It removes colons and semi-colons from places they're not commonly used in English (i.e., as the beginning of a term) and mostly uses words that are known in the general population. It also parses "humpbacks" - the programmers' traditional concatenation of a string of words - back into separate words. And, it replaces hyphens and underlines - also used for concatenation - with blank spaces.
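The clean-up steps described above can be sketched in a few lines of Python. This is only an illustration, not the code our system uses: the function name humanize is made up, and dropping the whole namespace prefix (rather than just the colon) is an assumption.

```python
import re

def humanize(term: str) -> str:
    """Turn a code-style term into space-separated words (hypothetical helper)."""
    # Assumption: drop any namespace prefix such as "air:" or a bare leading ":".
    local = term.split(":")[-1]
    # Hyphens and underlines used for concatenation become blank spaces.
    local = re.sub(r"[-_]+", " ", local)
    # Split "humpbacks" at lower-to-upper boundaries: "phoneRecord" -> "phone Record".
    local = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", local)
    # Split an acronym from a following capitalized word: "MAPolicy" -> "MA Policy".
    local = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", local)
    # Collapse any runs of whitespace left behind.
    return " ".join(local.split())

print(humanize(":HealthInformation"))
print(humanize("air:non-compliant-with"))
```

Terms written entirely in capitals, such as MEMBERLIST, pass through unchanged, so those would still need a lookup table of popular names.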
At last week's meeting, we talked about the possibility of generating output which simulates short English sentences. These might be stilted but would be most easily read by lawyers. Here's my first attempt at the top-level template:
Issue: Whether the transactions in [TransactionLogFilePopularName] {about [VariableName] [VariableValue]} comply with [MasterPolicyPopularName]?
Rule: To be compliant, [SubPolicyPopularName] of [MasterPolicyPopularName] requires [PatternVariableName] of an event to be [PatternValue1].
Fact: In transaction [TransactionNumber] [PatternVariableName] of the event was [PatternValue2].
Analysis: [PatternValue2] is not [PatternValue1].
Conclusion: The transactions appear to be non-compliant with [SubPolicyPopularName] of [MasterPolicyPopularName].
This seems to me approximately correct in the context of requests for the appliance to reason over millions of transactions with many sub-rules. A person seeking an answer from the system would create the Issue question. The Issue question is almost always going to ask whether some series of transactions violated a super-rule and often will have a scope limiter (e.g., with regard to a particular person, within a date range, or by one entity), denoted here by {}.
From the lawyer perspective, the interesting part of the result is the finding of non-compliance or possible non-compliance. So, the remainder of the output would be generated to describe only the failure(s) in a pattern-matching for one or more sub-rules. If there's more than one violation, the interface would display the Issue once and then the Rule to Conclusion steps for each non-compliant result.
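As a rough sketch of how the template might be filled in, the Python below prints the Issue once and then the Rule to Conclusion steps for each non-compliant result. The Violation class, its field names, and the render_report function are hypothetical and exist only for this illustration; the wording of each line follows the template above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Violation:
    """One failed pattern match (hypothetical structure for illustration)."""
    sub_policy: str   # [SubPolicyPopularName]
    variable: str     # [PatternVariableName]
    required: str     # [PatternValue1], the value the rule requires
    transaction: str  # [TransactionNumber]
    observed: str     # [PatternValue2], the value found in the event

def render_report(log_name: str, master_policy: str,
                  violations: List[Violation],
                  scope: Optional[Tuple[str, str]] = None) -> str:
    # The optional scope limiter corresponds to the {} part of the Issue line.
    scope_text = f" about {scope[0]} {scope[1]}" if scope else ""
    lines = [f"Issue: Whether the transactions in {log_name}{scope_text} "
             f"comply with {master_policy}?"]
    # Rule through Conclusion are repeated for every non-compliant result.
    for v in violations:
        lines.append(f"Rule: To be compliant, {v.sub_policy} of {master_policy} "
                     f"requires {v.variable} of an event to be {v.required}.")
        lines.append(f"Fact: In transaction {v.transaction} {v.variable} "
                     f"of the event was {v.observed}.")
        lines.append(f"Analysis: {v.observed} is not {v.required}.")
        lines.append(f"Conclusion: The transactions appear to be non-compliant "
                     f"with {v.sub_policy} of {master_policy}.")
    return "\n".join(lines)

example = Violation("Denial of Service Rule", "reason", "other than disability",
                    "Xphone Record 2892", "Infectious Disease")
print(render_report("Xphone's Customer Service Log",
                    "MA Disability Discrimination Law",
                    [example], scope=("Person", "Bob Same")))
```

Because the bound values are inserted as plain strings, the square brackets never reach the reader.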
I tried this out on a lawyer I know. He insisted it was unintelligible when the []'s were left in but said it was manageable when he saw the same text without them.
For our Scenario 9, Transaction 15, an idealized top level display would say:
Issue: Whether the transactions in Xphone's Customer Service Log about Person Bob Same comply with MA Disability Discrimination Law?
Rule: To be compliant, Denial of Service Rule of MA Disability Discrimination Law requires reason of an event to be other than disability.
Fact: In transaction Xphone Record 2892 reason of the event was Infectious Disease.
Analysis: Infectious disease is not other than disability.
Conclusion: The transactions appear to be non-compliant with Denial of Service Rule of MA Disability Discrimination Law.
Each one of the bound values should have a hotlink to a Tabulator display that provides background or details.
Right now, we might be able to produce:
Issue: Whether the transactions in Xphone's Customer Service Log about Betty JB reject Bob Same comply with MA Disability Discrimination Law?
Rule: To be non-compliant, Denial of Service Rule of MA Disability Discrimination Law requires REASON of an event to be category Health Information.
Fact: In transaction Xphone Record 2892 REASON of the event was category Health Information.
Analysis: category Health Information is category Health Information.
Conclusion: The transactions appear to be non-compliant with Denial of Service Rule of MA Disability Discrimination Law.
This example highlights a few challenges.
1) It's possible that only failures of policies containing comparative matches (e.g., :v1 sameAs :v2; :v9 greaterThan :v3; :v12 withinDateRange :v4) are legally relevant. This needs more thought.
2) We'd need to name every sub-policy or have a default called UnnamedSubPolicy.
3) We'd need to be able to translate statute numbers to popular names and have a default instruction to include the statute number when no popular name exists.
4) We'd need some taxonomies (e.g., infectious disease is a sub-class of disability).
5) In a perfect world, we'd have some way to trigger a couple of alternative displays. For example, it would be nice to be able to trigger one of two rule structures: either one that says a rule requires a match or one that says a rule requires a non-match. The reason for this is that if we always have to use the same structure, about half of the outputs will be very stilted and cause the lawyers to struggle to understand.
6) We need some way to deal with something the system can't reason about. If the law requires the reason to be disability and the system doesn't know whether health information is the same as or different from disability, then it ought to be able to produce an analysis that says something along the lines of "The relationship between Health Information and disability is unknown" and produce a conclusion that says "Whether the transaction is compliant is unknown." If we're reasoning over millions of transactions there are likely to be quite a few of these and they ought to be presented after the non-compliant ones.
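Challenges 4 and 6 can be illustrated together with a small Python sketch: a toy taxonomy decides whether an observed value falls under a prohibited category, and anything the taxonomy cannot settle is reported as unknown rather than guessed. Everything here is an assumption made for illustration: the taxonomy entries, the KNOWN_DISTINCT table, and the function names are invented, and the real system would draw such facts from its own data.

```python
from enum import Enum

class Outcome(Enum):
    NON_COMPLIANT = 0
    UNKNOWN = 1
    COMPLIANT = 2

# Hypothetical taxonomy: each key is a sub-class of its value.
SUBCLASS_OF = {"Infectious Disease": "Disability"}
# Hypothetical pairs the system knows to be unrelated.
KNOWN_DISTINCT = {frozenset({"Billing Dispute", "Disability"})}

CONCLUSIONS = {
    Outcome.NON_COMPLIANT: "The transaction appears to be non-compliant.",
    Outcome.UNKNOWN: "Whether the transaction is compliant is unknown.",
    Outcome.COMPLIANT: "The transaction appears to be compliant.",
}

def is_a(value, category):
    """True if value equals category or reaches it through sub-class links."""
    seen = set()
    while value is not None and value not in seen:
        if value == category:
            return True
        seen.add(value)
        value = SUBCLASS_OF.get(value)
    return False

def assess(observed, prohibited):
    """Classify an event whose reason must be other than `prohibited`."""
    if is_a(observed, prohibited):
        outcome = Outcome.NON_COMPLIANT
        analysis = f"{observed} is not other than {prohibited}."
    elif frozenset({observed, prohibited}) in KNOWN_DISTINCT:
        outcome = Outcome.COMPLIANT
        analysis = f"{observed} is other than {prohibited}."
    else:
        outcome = Outcome.UNKNOWN
        analysis = f"The relationship between {observed} and {prohibited} is unknown."
    return outcome, analysis, CONCLUSIONS[outcome]

def order_for_display(results):
    """Drop compliant results; list non-compliant ones before unknown ones."""
    flagged = [r for r in results if r[0] is not Outcome.COMPLIANT]
    return sorted(flagged, key=lambda r: r[0].value)

results = [assess(reason, "Disability")
           for reason in ("Health Information", "Billing Dispute", "Infectious Disease")]
for outcome, analysis, conclusion in order_for_display(results):
    print(outcome.name, "|", analysis, "|", conclusion)
```

Treating "unknown" as its own outcome, instead of folding it into compliant or non-compliant, is what lets the display present those cases after the confirmed failures.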