Transparent Accountable Data Mining Initiative - UPDATE

Posted by K Krasnow Waterman on Wed, Apr 26, 2006 @ 22:04 PM
The pace is picking up at CSAIL!

We've posted lots of code, some summaries, and the next scenarios we will work on.

If you haven't looked at the project yet, the summary is here.

A summary of the work and links to the details are here.

Scenario 3 - is a fictional scenario in which information about airline passengers is collected by the Transportation Security Administration (TSA), matched against possible terrorist names, and ultimately forwarded to the FBI.

In this scenario, the system checks to see if TSA's actions are appropriate under the Privacy Act.

That law requires agencies to publish a "System of Records Notice" (called a SORN) stating the source, type, and authorized purpose for acquiring data. The law also requires agencies to publish the terms and conditions -- called "Routine Uses" -- under which they will disseminate information.

We now have code:
  • that represents a series of transactions (in various forms, including NIEM, N3, and RDF);
  • that represents some of the rules in the Privacy Act;
  • that checks to see if the data TSA collects is permitted under the SORN; and
  • that checks to see if the dissemination from TSA to the FBI matches the requirements in the Routine Uses published by the TSA.
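
To make the two checks concrete, here is a rough sketch in plain Python. It is not the project's code (which is written in forms such as N3 and RDF), and everything in it is an assumption made for illustration: the Sorn and RoutineUse classes, the field names, and the sample values are all invented and far simpler than a real notice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sorn:
    """Hypothetical, heavily simplified System of Records Notice."""
    agency: str
    sources: frozenset     # where the agency may acquire data from
    data_types: frozenset  # what kinds of data it may acquire
    purposes: frozenset    # why it may acquire the data


@dataclass(frozen=True)
class RoutineUse:
    """Hypothetical, simplified published condition for dissemination."""
    recipient: str
    data_types: frozenset
    purpose: str


def collection_permitted(sorn, source, data_type, purpose):
    # A collection is allowed only if source, type, and purpose all appear in the SORN.
    return (
        source in sorn.sources
        and data_type in sorn.data_types
        and purpose in sorn.purposes
    )


def dissemination_permitted(routine_uses, recipient, data_type, purpose):
    # A dissemination is allowed if at least one Routine Use covers it.
    return any(
        ru.recipient == recipient
        and data_type in ru.data_types
        and ru.purpose == purpose
        for ru in routine_uses
    )


# Invented sample values, not taken from any actual TSA notice.
tsa_sorn = Sorn(
    agency="TSA",
    sources=frozenset({"airline"}),
    data_types=frozenset({"passenger_name"}),
    purposes=frozenset({"aviation_security"}),
)
tsa_routine_uses = [
    RoutineUse("FBI", frozenset({"passenger_name"}), "terrorism_investigation"),
]

print(collection_permitted(tsa_sorn, "airline", "passenger_name", "aviation_security"))
print(dissemination_permitted(tsa_routine_uses, "FBI", "passenger_name", "terrorism_investigation"))
print(dissemination_permitted(tsa_routine_uses, "FBI", "passenger_name", "marketing"))
```

The sketch treats each rule as a simple membership test. Real rules carry conditions and exceptions that a membership test cannot express, which is part of why a rules language is a better fit than this.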

Scenario 4 - expands the fictional scenario 3 to have a longer chain of actions. 

The data is disseminated by the FBI to the US Marshals, who then approach an Assistant United States Attorney, who in turn presents the matter to a Court in order to obtain a warrant.

This will give us a chance to see if a recipient agency's conditions for receiving data (stated in its SORN) are consistent with the providing agency's rules for dissemination (stated in its Routine Uses).
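
As a rough illustration of that consistency check, the sketch below walks a chain of agencies and reports the first hop where the provider's Routine Uses and the recipient's SORN disagree. It is plain Python rather than the project's code, and the rule dictionaries, field names, and sample values are all invented for the example. It also assumes, as a simplification, that provider and recipient describe the purpose in the same words.

```python
def hop_is_consistent(provider, recipient, rules, data_type, purpose):
    """Check one hop: the provider may send AND the recipient may receive."""
    routine_uses = rules[provider]["routine_uses"]
    sorn = rules[recipient]["sorn"]
    may_send = any(
        ru["recipient"] == recipient
        and data_type in ru["data_types"]
        and ru["purpose"] == purpose
        for ru in routine_uses
    )
    may_receive = (
        provider in sorn["sources"]
        and data_type in sorn["data_types"]
        and purpose in sorn["purposes"]
    )
    return may_send and may_receive


def first_inconsistent_hop(chain, rules, data_type, purpose):
    """Return the first (provider, recipient) pair that fails, or None."""
    for provider, recipient in zip(chain, chain[1:]):
        if not hop_is_consistent(provider, recipient, rules, data_type, purpose):
            return (provider, recipient)
    return None


# Invented rules. The US Marshals entry deliberately omits the FBI as a
# source so that the sketch has a mismatch to find.
rules = {
    "TSA": {
        "routine_uses": [
            {"recipient": "FBI", "data_types": {"passenger_name"}, "purpose": "investigation"},
        ],
    },
    "FBI": {
        "sorn": {"sources": {"TSA"}, "data_types": {"passenger_name"}, "purposes": {"investigation"}},
        "routine_uses": [
            {"recipient": "US Marshals", "data_types": {"passenger_name"}, "purpose": "investigation"},
        ],
    },
    "US Marshals": {
        "sorn": {"sources": {"courts"}, "data_types": {"passenger_name"}, "purposes": {"investigation"}},
    },
}

chain = ["TSA", "FBI", "US Marshals"]
print(first_inconsistent_hop(chain, rules, "passenger_name", "investigation"))
```

Checking each hop from both sides is the point: a dissemination can be allowed by the sender's Routine Uses and still fall outside what the receiver's SORN says it may acquire.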

Scenario 5 - expands the scenario to consider case law, the decisions of courts regarding disputes of interpretation.

Scenario 6 - is a different fictional scenario involving a web-crawler that aggregates data about individuals.

(Scenarios 5 & 6 should be posted by tomorrow.)

Although our long-term goal is proof generation, we've learned a lot about what it takes to build rules-based reasoning for law in general and for information acquisition and dissemination by the government in particular. I've promised to write a paper on the lessons learned, so tune in for that soon (projected to be finished by June 1).


Topics: technology innovation, technology implementing law