Thursday, November 27, 2008

The REAL problem with biometrically verifiable ID cards

There has been a recent push towards ID cards for everyone in the UK. There is a huge backlash against this idea, partly because of recent incidents which tell us that the government cannot be really trusted to keep the data safe. But a far more serious problem is that biometric ID cards are unforgeable and can be used by even less trustworthy people, Islamic terrorists for instance, to target specific people:
http://www.pbs.org/wnet/wideangle/episodes/iraqi-exodus/violence-targets-iraqs-christians/3456/
On Sunday, Farques Batool, a Christian who owned a music store, was gunned down and killed in his shop. His teenage nephew was also wounded. A pharmacist was killed Friday by a man who pretended to be an undercover police officer, asked to the pharmacist’s identification card, then shot him. Religion is listed on government-issued ID cards in Iraq.
Verifiable identity is a grave security risk, never mind all the useful side effects, such as being able to get your tax refund without actually filling in forms, based on unique ID as primary key into different data bases.

Tuesday, May 06, 2008

Virus vs worm

What exactly is the difference between a virus and a worm? Bruce Davie in his 6.829 class last week said security was one of those things that you truly dont know unless you know all of it. He also said it was difficult for him to get a clear definition for virus and worm. The final one he settled on, was that a virus, like a biological virus, does not actually have the code for replication and requires a host. Whereas, a worm has a life of its own, i.e. it has the code to replicate itself. No user action is required to replicate a worm.

Saturday, March 08, 2008

Looeyville, Kentucky

The 8th floor of Stata center has "True" architecture people as well as the Advanced Network Architecture group. I just went to the restroom and found the following posted on the two loo stalls: "Out of Order" and "Super Scalar".

Both in the same handwriting.

Tuesday, January 01, 2008

Inconsistencies in the Reality Mining dataset?

The MIT Reality Mining dataset has become a touchstone for DTN and social network type research, and I am exploring using that for some of my own simulations. My intended use is to use it as a source trace for when the experiment subjects were in bluetooth proximity with each other.

Imagine my surprise, therefore, when I ran the following SQL query, for average duration of bluetooth proximity, and found that this average was a negative number!

I chose the start and end dates to coincide with a period when Media Lab is frantically preparing for their annual showcase to Industry sponsors. Nathan Eagle, whose PhD consisted of gathering and mining this data set talks about this time in his thesis.

mysql> select avg(endtime-starttime) from (select ds.starttime,ds.endtime,ds.person_oid as src,d.person_oid as dst from devicespan ds left outer join device d on ds.device_oid=d.oid where d.person_oid != 0 and ds.starttime >'2004-10-01' and ds.endtime < '2004-11-01' order by ds.starttime) as xxx;
+------------------------+
| avg(endtime-starttime) |
+------------------------+
| -948.28116818633 |
+------------------------+
1 row in set (0.00 sec)


It turns out that this is because there are 4 records which is erroneous in this period, including one eggregious mistake, which looks like a typo (a date of 7/10 rather than 10/7):
mysql> select * from (select ds.starttime,ds.endtime,ds.person_oid as src,d.person_oid as dst from devicespan ds left outer join device d on ds.device_oid=d.oid where d.person_oid != 0 and ds.starttime >'2004-10-01' and ds.endtime < '2004-11-01' order by ds.starttime) as xxx where endtime < starttime;
+---------------------+---------------------+-----+------+
| starttime | endtime | src | dst |
+---------------------+---------------------+-----+------+
| 2004-10-07 15:51:34 | 2004-07-10 16:02:55 | 92 | 36 |
| 2004-10-31 02:02:02 | 2004-10-31 01:23:40 | 46 | 86 |
| 2004-10-31 02:12:46 | 2004-10-31 01:23:40 | 46 | 22 |
| 2004-10-31 02:18:12 | 2004-10-31 01:44:40 | 46 | 73 |
+---------------------+---------------------+-----+------+


Now, if we exclude these records, we get:
mysql> select avg(endtime-starttime) from (select ds.starttime,ds.endtime,ds.person_oid as src,d.person_oid as dst from devicespan ds left outer join device d on ds.device_oid=d.oid where d.person_oid != 0 and ds.starttime >'2004-10-01' and ds.endtime < '2004-11-01' order by ds.starttime) as xxx where endtime >=starttime;
+------------------------+
| avg(endtime-starttime) |
+------------------------+
| 12460.984108352 |
+------------------------+

There are a total of 22154 records in this period, so if you add in the effect of one mistake (bolded above), averaged over each record, you get roughly 12460+948=13405:
mysql> select (timestamp('2004-07-10 16:02:55')-timestamp('2004-10-07 15:51:34'))/22154;
+---------------------------------------------------------------------------+
| (timestamp('2004-07-10 16:02:55')-timestamp('2004-10-07 15:51:34'))/22154 |
+---------------------------------------------------------------------------+
| -13405.9257470434 |
+---------------------------------------------------------------------------+

(The other three records contribute 3 seconds to the wrong average).