Monday, November 30, 2015
CHILD PORNOGRAPHY, THE INTERNET, AND THE CHALLENGE OF UPDATING STATUTORY TERMS http://harvardlawreview.org/
4 Out of 5 College Kids Sext By Belinda Luscombe
Sexting and college, they go together like carnal and knowledge. But a recent survey from the University of Rhode Island has put some numbers on how widespread it is. And the answer is:
w i d e.
Seventy-eight percent of students in the survey say they’ve received sexually suggestive messages and 56% say they have received intimate images. Two-thirds of the students said they sent salacious messages. Before we call a national epidemic of Weiner-itis, we should note that 73% of texts were sent to a romantic partner. Almost like a romantic old love-letter, but shorter and with more emoticons.
And, it seems, with more staying power. Almost a fifth of the people who received the racy messages then forwarded them on to somebody else. And 10% of all the explicit messages sent were relayed without permission from the original author. (Those statistics should be put on a label and stuck on cellphones everywhere.)
“At the age of most college students, people are filtering through relationships at a faster rate,” said Tiffani Kisler, one of the study’s authors. “People want to feel a sense of belonging so they are sharing more of themselves with people they are still getting to know. Once they click the send button, they don’t know where else a message will end up.”
The study sample was small (200) and limited to Rhode Island students, but the issue of younger people sending explicit images and messages via cellphone is increasingly worrying. There have been several high-profile cases recently in which a forwarded sext has made life misery for the original composer of the message. It has also left those forwarding the message facing child pornography charges.
The Blurry Boundaries of Child Porn Not every illicit image is equally offensive, by Jesse Walker
There are several reasons why child pornography isn’t governed by the laxer rules that regulate adult porn, but one rationale lies at the core of the law: Child porn is a record of a crime. It is illegal for a grown man to have sex with, say, a 4-year-old boy, and he doesn’t get to claim the protection of the First Amendment just because someone was photographing him while he committed the act. When jihadists behead their captives on video, the tape doesn’t change the fact that they’re murderers. It demonstrates and advertises it. And the photographer is an accessory to the crime.
Unfortunately, people have been arraigned for images that are a far cry from a man raping a toddler. Besides the “sexting” incidents, in which teenagers are charged with producing porn after flirtatiously emailing photos of themselves to each other, there are cases like the one involving 28-year-old Todd Senters, who videotaped himself having sex with his 17-year-old girlfriend. This happened in Nebraska, where the age of consent is 16, so the intercourse itself was legal. But under the federal Child Protection Act of 1984, a person in a pornographic performance is considered a child if he or she is under 18. So the tape was illicit even though the act was not—not a record of a crime, but still a crime itself. The Nebraska Supreme Court upheld Senters’ conviction in 2005.
If the definition of child porn is broader than it needs to be, the great bulk of enforcement, at the federal level at least, is aimed at material most liberal-minded Americans would consider criminal. That represents a profound legal and social shift. It’s hard to believe, but the law didn’t distinguish child pornography from other sorts of porn until the 1970s, when greater tolerance for sexually explicit material made the seediest stuff more visible, prompting a backlash. Even then, the sole federal law on the subject—the Protection of Children Against Sexual Exploitation Act, passed in 1978—merely stiffened the penalties for material that would already be illegal under the obscenity statutes. And no item was obscene if it had “serious literary, artistic, political, or scientific value.” It wasn’t until New York v. Ferber in 1982 that the U.S. Supreme Court allowed legislators to outlaw images of children that didn’t fit the strict legal definition of obscenity. Child abuse is child abuse, the unanimous court declared, even if the pictures of that abuse have artistic merit.
Since then, police have mostly aimed enforcement not at the producers of the porn but at the distributors and, more controversially, the consumers. There are three central reasons why the law pursues people who possess child porn as well as those who make and sell it:
1. To eliminate the market. In Ferber, the Court noted that while “the production of pornographic materials is a low-profile, clandestine industry,” distribution networks must be “visible” to be effective; therefore “the most expeditious if not the only practical method of law enforcement may be to dry up the market for this material.” If the market is disrupted, that reduces the incentives to create the images in the first place.
This is the most understandable rationale for the restriction. If you pay for a kiddie porn tape, you’re not just looking at images. You’re creating an incentive to make more of them and therefore to abuse more children.
In the Web era, you might be doing more than that. In 2006 The New York Times described PlayToy, a website that offers “scores of original photographs of scantily clad under-age children...often posed in ways requested by subscribers.” That isn’t the only online community that solicits such input from its users, and it surely isn’t the worst of them; at least the PlayToy models weren’t being molested onscreen. Media scholars have described an “active audience” that reframes, reinterprets, and even rewrites its favorite texts. This is the active audience on steroids, the ugly underside of the user-driven Web. In the worst-case scenario, such consumers are co-conspirators, morally if not legally, in the rape and abuse of children.
2. To police people’s thoughts. The Child Pornography Prevention Act of 1996 banned even computer-generated “virtual” porn that was produced without any actual kids. Such material, the law declared, “encourages a societal perception of children as sexual objects.” In other words, the measure targeted not the acts performed in front of a camera but the acts performed within people’s minds.
The Supreme Court struck down that law as overbroad in 2002, but the rationale behind it has been a constant companion to the crackdown on child porn. An argument that most researchers roundly reject when the topic is adult pornography—that viewing it incites people to commit sexual violence—is frequently cited uncritically where porn featuring children is concerned. The Supreme Court may be wary of laws that invoke a “paternalistic interest in regulating [the] mind,” as Justice Byron White once put it, but it’s clear that many activists and legislators do not share those qualms.
If that were all there was to the issue, the civil libertarian position would be fairly clear. Restrictions on purchasing child porn might be justifiable, but restrictions on merely possessing it—acquiring it for free via Kazaa, say—would not. (If the idea is to cut into child pornographers’ profits, peer-to-peer sharing might be more ally than enemy.) But there are other arguments, including one in particular that’s worth some thought:
3. To protect the privacy of the victims. “Because the child’s actions are reduced to a recording,” the attorney David P. Shouvlin wrote in the Wake Forest Law Review in 1981, “the pornography may haunt him in future years, long after the original misdeed took place. A child who has posed for a camera must go through life knowing that the recording is circulating within the mass distribution system for child pornography.” Those words evidently impressed the Supreme Court, which quoted them a year later in Ferber. The viewing, in this analysis, is itself a perpetuation of the abuse.
Such arguments undergird Masha’s Law, named for Masha Allen, a Russian orphan who was held prisoner, raped repeatedly on camera, and advertised in the kiddie porn world as “Disney World Girl.” The measure, which became law in 2006, allows adults who were victimized by pornographers as minors to sue people who download the resulting images.
Emotionally, it’s a compelling concept. And where invasion of privacy is the concern, civil remedies certainly make more sense than criminal prosecutions. But the idea opens a can of worms. If the issue is privacy, shame, and being haunted by ineradicable images, wouldn’t the same argument apply to the abused prisoners photographed at Abu Ghraib? To hostages filmed by their captors and aired on the news? To anyone humiliated in front of a camera? Should an inadvertent Internet celebrity, deeply embarrassed that people are chuckling at a clip of his light-saber dance, have standing to sue the viewers?
That last example might seem absurd, but it actually veers close to the pornography debate. Because the child porn laws set the age of maturity so high, they cover not just the victims of coercion but exhibitionists who voluntarily put photographs of themselves online. There also are people who post pictures that are salacious but don’t include the “lascivious exhibition of the genitals or pubic area” invoked by the law. They do not necessarily intend for anyone but their friends to see the photos. But the Internet doesn’t always work that way.
Consider Amanda Wenk, a teenager who became an online celebrity in 2005 after she posted pictures of herself in bikinis, tight T-shirts, and low-cut dresses on her Webshots site. She took down the photos after they attracted outside attention, but by that time they had escaped to Fark and other forums for people who like to swap online ephemera.
She wasn’t really a child at the time, but the law says she was; the images aren’t much more pornographic than a high school yearbook, but some people clearly use them as though they were Playboy centerfolds. She is presumably embarrassed by the attention, given that she tried to remove the pictures from the Web. She may well be haunted by it. Is it the role of the government to preserve her peace of mind?
The difference between what happened to Amanda Wenk and what happened to Masha Allen should be obvious. But both must, to borrow the phrase the Supreme Court quoted in Ferber, “go through life knowing that the recording is circulating within the mass distribution system for child pornography.” I’m not convinced that’s reason enough to punish the people who merely see those recordings, as opposed to the people who actively participate in the abuse of prisoners like Allen—or the inmates at Abu Ghraib.
Child porn evidence unreliable: study of Playboy
By Ivan Oransky
NEW YORK (Reuters Health) - A commonly used method of judging a woman's sexual maturation may not be good enough in child pornography prosecutions.
That, at least, is what a group of pediatric endocrinologists concluded from a study of more than 500 Playboy centerfolds.
"So often these people get convicted on what I refer to as felonious bad taste," said Dr. Arlan Rosenbloom, a pediatric endocrinologist at the University of Florida in Gainesville. "They're downloading stuff that isn't very nice, but isn't illegal."
In many cases, prosecutors build their cases against people who have downloaded images without any way to confirm the subjects' ages.
In a study published online today in the journal Pediatrics, Rosenbloom and his colleagues write that they were prompted to examine "547 images with breast exposure from an anthology of the monthly centerfold illustrations in Playboy magazine from December 1953 to December 2007" because Rosenbloom had seen the so-called Tanner scale misused in trials.
The scale, published in 1969 by Dr. James Tanner and a colleague, describes five stages in the development of male and female sexual characteristics such as the shape of breasts and presence of pubic hair. Stage 5 is referred to as "mature," leading to some confusion over whether all women over 18 were in that stage.
The authors chose Playboy centerfolds because publisher Hugh Hefner is known for being scrupulous about only having models over 18. (The study may be the only one in the scientific literature to cite 2007's "Playboy: The Complete Centerfolds.")
When four pediatric endocrinologists looked at the centerfolds, however, at least one of them thought more than a quarter were Tanner stage 4. That kind of evidence, argued Rosenbloom, who has testified on behalf of defendants in such cases, has been used to bring people to trial and even convict them, based on the false idea that all women over 18 were Tanner stage 5.
"I thought it was important to let the pediatric public know that this is not an appropriate distinction," Rosenbloom, who has a related paper coming out soon in the International Journal of Legal Medicine, told Reuters Health.
'WHOLLY ILLEGITIMATE USE'
Tanner himself - who died in 2010 - acknowledged in a 1998 letter co-authored with Rosenbloom that judging a person's age based on the stages that bear his name is a "wholly illegitimate use."
"Physical definitions of 'child' are inherently problematic," Richard Wortley, a professor at University College London and co-author of "Internet child pornography: Causes, investigation and prevention," told Reuters Health by email. A recent court case, for example, "ruled that pictures that 'appear to be' of a minor was too imprecise, and that chronological age should be the criterion. Internationally most legislation is based on chronological age."
Yaman Akdeniz, a law professor at Istanbul Bilgi University and the author of "Internet Child Pornography and the Law," said most child pornography cases involve images of "younger children under the age of 12 rather than older teenagers." In such cases, distinguishing between Tanner stages 4 and 5 would be irrelevant.
However, he told Reuters Health by email, "courts deal with some contested cases in which the age of the person appearing in such images is questioned." How often the Tanner scale is used, however, is "very difficult to say," he said, noting that it should not be the only evidence to determine guilt.
The paper makes a good case against using Tanner stages in child pornography cases, according to Chuck Kleinhans, of Northwestern University's School of Communication in Evanston, Illinois.
But Kleinhans, who has studied child pornography issues, noted in an email to Reuters Health that the Playboy Playmates are "carefully groomed and have full body makeup before they go in front of the cameras." They're also "very carefully posed and lit," and "famously retouched (airbrushed) and modified before publication."
"Given contemporary Photoshopping technology, centerfolds are completely different from clinical medical photography, especially for forensic purposes," he said.
Rosenbloom acknowledged that the centerfolds were hardly the perfect set of data. He noted, however, that if anything, given that they tended to be well-endowed, they biased the sample against his claim, strengthening the conclusions.
The study, after all, looked at how often an "expert" might consider breasts to be at stage 4, he said; it was not designed to determine how many women had breasts at that stage. "For that you would need to have clinical grade photos or at best, examination of at least 100 young women, not an easy task."
E-DISCOVERY & DIGITAL EVIDENCE MISSISSIPPI COLLEGE OF LAW E-DISCOVERY CONFERENCE 2014 by CRAIG BALL
Possession of Child Pornography: Should You be Convicted When the Computer Cache Does the Saving for You? by Giannina Marin
Disentangling Child Pornography from Child Sex Abuse, by Carissa Byrne Hessick
The Unbelievable Truth About Child Pornography in the UK by John Carr
Child Pornography Guidelines Are Ripe for Challenge BY ALAN ELLIS AND KAREN L. LANDAU
Deconstructing the Myth of Careful Study: A Primer on the Flawed Progression of the Child Pornography Guidelines by Troy Stabenow
Saturday, November 28, 2015
Don't believe the hype
Despite being the safest and healthiest humans to have lived, we allow 'experts' to scare us witless, says Dan Gardner.
'Recent figures suggest some 50,000 pedophiles are prowling the internet at any one time," says the website of Innocence in Danger, a non-government organisation based in Switzerland. No source is cited for the claim, which appears under the headline "Some terryfying [sic] statistics".
It is indeed a terrifying statistic. It is also well-travelled. It has been cited in Britain, Canada, the US, and points beyond. Like a new strain of the flu, it has spread from newspaper articles to TV reports to public speakers, websites, blogs, and countless conversations of frightened parents. It even infected Alberto Gonzales, the former US attorney-general.
Unfortunately, the mere fact that a number has proliferated, even at the highest levels of officialdom, does not demonstrate the number is true.
There's one obvious reason to be at least a little suspicious. It's a round number. A very round number. It's not 47,000 or 53,500. It's 50,000. And 50,000 is just the sort of perfectly round number people pluck out of the air when they make a wild guess.
And what method aside from wild guessing could one use to come up with the number of pedophiles online? Accurate counts of ordinary internet users are tough enough. But pedophiles? Much as one may wish they were all identified and registered with the authorities, they aren't, and they aren't likely to be completely frank about their inclinations when a phone surveyor calls to ask about online sexual habits.
Another reason for caution is the way this alleged fact changes from one telling to another. Britain's Independent states there are "as many as" 50,000 pedophiles online. Other sources say there are precisely 50,000. A few claim "at least" 50,000.
There's also variation in what those pedophiles are supposed to be up to. In some stories, the pedophiles are merely "online" and the reader is left to assume they are doing something other than getting the latest headlines or paying the water bill. Others say the pedophiles are "looking for children".
In the most precise account, all 50,000 pedophiles are said to have "one goal in mind: to find a child, strike up a relationship and eventually meet with the child". This spectacular feat of mind-reading can be found on the website of Spectorsoft, a company that sells frightened parents software that monitors their children's online activities for the low cost of $US99.95 ($105).
Then there is the supposed arena in which those 50,000 pedophiles are said to be operating. In some versions, it's 50,000 around the world, or on the whole of the internet. But an American blogger narrowed that considerably: "50,000 pedophiles at any one time are on MySpace.com and other social networking sites looking for kids".
And a story in the magazine Dallas Child quotes two parent-activists - identified as "California's Parents of the Year for 2001" - who say, "The internet is a wonderful tool, but it can also be an evil one, especially sites like MySpace.com. At any one given time, 50,000 pedophiles are on the site."
All this should have our inner sceptic ringing alarm bells. But there is a final, critical question to be answered before we can dismiss this number as junk: what is its source? In most of the number's appearances, no source is cited. The author simply uses the passive voice ("It is estimated that … ") to paper over this gaping hole. Another way to achieve the same effect - one used far too often in newspapers - is to simply quote an official who states the number as fact.
The number then takes on the credibility of the official even though the reader still doesn't know the number's source. After an article in the Ottawa Citizen repeated the 50,000 pedophiles figure within a quotation from Ian Wilms, the president of the Canadian Association of Police Boards, I called Wilms and asked where he got the number. It came up in a conversation with British police, he said. And no, he couldn't be more precise.
Fortunately, there are several versions of the "50,000 pedophiles" story - including the article in The Independent - that do point to a source. They all say it comes from the Federal Bureau of Investigation. So I called the FBI. No, a spokeswoman said, that's not our number. We have no idea where it came from. And no, she said, the bureau doesn't have its own estimate of the number of pedophiles online because that's impossible to figure out.
Scepticism is rarely enough to finish off a dubious but useful number, however.
In April 2006, the then US attorney-general, Alberto Gonzales, told the National Centre for Missing and Exploited Children: "It is simply astonishing how many predators there are … at any given time, 50,000 predators are on the internet prowling for children." The source of this figure, Gonzales said, was "the television program Dateline".
Gonzales should listen to National Public Radio more often. When journalists from the broadcaster asked Dateline to explain where it got this number, they were told by the show's Chris Hansen that it had interviewed an expert and asked him whether this number that "keeps surfacing" is accurate.
The expert replied, as paraphrased by Hansen: "I've heard it, but depending on how you define what is a predator, it could actually be a very low estimate." Dateline took this as confirmation the number was accurate and repeated it as unqualified fact on three different shows.
The expert Dateline spoke to was an FBI agent, Ken Lanning. When NPR asked Lanning about the magic number, he said: "I didn't know where it came from. I couldn't confirm it, but I couldn't refute it, but I felt it was a fairly reasonable figure."
Lanning also noted a curious coincidence: 50,000 has made appearances as a key number in at least two previous panics in recent years. In the early 1980s, it was supposed to be the number of children kidnapped by strangers every year. At the end of the decade, it was the number of murders committed by satanic cults. These claims, widely reported and believed at the time, were later revealed to be nothing more than hysterical guesses that became "fact" in the retelling.
Now it may be that, as Lanning thinks, the 50,000 figure is close to the reality. But it may also be way off the mark. There may be five million pedophiles on the internet at any given moment, or 500, or five. Nobody really knows. This number is, at best, a guess made by persons unknown.
To get a number that matches the sort of pedophile-in-the-shadow attacks that terrify parents, NISMART (National Incidence Studies of Missing, Abducted, Runaway, and Thrownaway Children) created a category called stereotypical kidnappings: a stranger or slight acquaintance takes or detains a child overnight, transports the child more than 35 kilometres, holds the child for ransom or with the intention of keeping him or her, or kills the child. NISMART estimated that in one year the total number of stereotypical kidnappings in the US was 115. If that number is adjusted to include only children younger than 14 when they were kidnapped it is 90.
To look at these statistics rationally, we have to remember that there are roughly 70 million American children. With just 115 cases of children under 18 being stolen by strangers, the risk to any one American minor is about 0.00016 per cent, or 1 in 608,696. For those 14 and under the numbers are only slightly different. There are roughly 59 million Americans aged 14 and under, so the risk is 0.00015 per cent. That's 1 in 655,555.
To put that in perspective, consider the swimming pool. In 2003, the total number of American children 14 and younger who drowned in a swimming pool was 285. Thus the chance of a child drowning in a swimming pool is 1 in 245,614 - or more than 2.5 times greater than the chance of a child being abducted by a stranger. Also in 2003, 2408 children 14 and younger were killed in car crashes. That makes the probability of such a death 1 in 29,070. Thus, a child is 26 times more likely to die in a car crash than to be abducted by a stranger.
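The "1 in N" odds quoted in the last two paragraphs follow directly from the figures given (roughly 70 million American children, 115 stereotypical kidnappings, 285 pool drownings, and 2,408 car-crash deaths). A quick sketch of the arithmetic, using only the numbers from the extract:

```python
# Figures quoted in the extract above.
children = 70_000_000   # American children (rough total used in the extract)
kidnappings = 115       # stereotypical stranger kidnappings in one year
drownings = 285         # pool drownings, children 14 and younger (2003)
car_deaths = 2_408      # car-crash deaths, children 14 and younger (2003)

# Convert each count into "1 in N" odds.
odds_kidnap = children / kidnappings   # about 1 in 608,696
odds_drown = children / drownings      # about 1 in 245,614
odds_car = children / car_deaths       # about 1 in 29,070

print(round(odds_kidnap), round(odds_drown), round(odds_car))
```

The drowning and car-crash odds come out roughly 2.5 and 21 times shorter than the kidnapping odds, which is the comparison the extract is driving at.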
The numbers vary from country to country, but everywhere the likelihood of a child being snatched by a stranger is almost indescribably tiny. In Britain, a Home Office report states: "There were 59 cases involving a stranger successfully abducting a child or children, resulting in 68 victims." With 11.4 million children under 16, that works out to a risk of 1 in 167,647. (Note that the British and American numbers are based on different definitions and calculation methods; they aren't directly comparable.)
In Canada, Marlene Dalley of the National Missing Children Services carefully combed police data banks for the years 2000 and 2001 and discovered the total number of cases in which a child was abducted by a "stranger" - using a definition that included "neighbour" or "friend of the father" - was five. As for abductions by true strangers, there was precisely one in two years. There are roughly 2.9 million children aged 14 or younger in Canada. Thus the annual risk to one of those children is 1 in 5.8 million.
As to how these terrible cases end, the statistics flashed briefly by CNN were almost accurate. According to NISMART's rounded numbers (hence they don't quite add up to 100 per cent), 57 per cent of children abducted by strangers in a stereotypical kidnapping were returned alive, while 40 per cent were killed. Four per cent were not found. One critical fact not mentioned in the show is that nine out of 10 stranger abductions are resolved within 24 hours.
All these numbers boil down to something quite simple. First, the overwhelming majority of minors are not abducted. Second, the overwhelming majority of minors who are abducted are not taken by strangers. Third, the overwhelming majority of minors abducted by strangers are not taken in circumstances resembling the stereotypical kidnapping that so terrifies parents. Fourth, the number of stereotypical kidnappings is so small that the chance of that happening to a child is almost indescribably tiny. And finally, in the incredibly unlikely event that a child is snatched by a lurking pedophile, there is a good chance the child will survive and return home in less than a day.
An edited extract from Risk: The Science and Politics of Fear (Scribe, $35) by Dan Gardner. Published next Saturday.
Thursday, November 19, 2015
Online Predators—Myth versus Reality By Janis Wolak, J.D., with the assistance of Lindsey Evans, Stephanie Nguyen, and Denise A. Hines, Ph.D.
Wednesday, November 18, 2015
Hash Value Tool (Or “Digital Fingerprint”) Increasingly Noted In Cases Involving Electronic Evidence
Over the past few years, an increasing number of cases have discussed the role of "hash values" (the outputs of mathematical algorithms) used to identify electronic images, records, files, or other evidence. Hash values, commonly referred to as "digital fingerprints," confirm with a high degree of accuracy whether two records or files match or differ. See, e.g., United States v. Cartier, 543 F.3d 442, 444 (8th Cir. 2008) (No. 07-3222) ("Every digital image or file has a hash value, which is a string of numbers and letters that serves to identify the image or file.") (footnote omitted).
As we have previously noted, “hash” values are an important tool used to identify and authenticate digital evidence. See Using “Hash” Values In Handling Electronic Evidence; see also Hash Values Used To Confirm Seized Video Clips And Images; Federal Judicial Center, Managing Discovery of Electronic Information: A Pocket Guide for Judges, at 24 (2007) (“‘Hashing’ is used to guarantee the authenticity of an original data set and can be used as a digital equivalent of the Bates stamp used in paper document production.”) (quoted in Lorraine v. Markel American Ins. Co., 241 F.R.D. 534, 546-47 & n.23 (D. Md. 2007) ("Hash values can be inserted into original electronic documents when they are created to provide them with distinctive characteristics that will permit their authentication under Rule 901(b)(4).")).
A recent review of cases referring to the use of hash values highlights the growing acceptance of this tool on forensic issues involving electronic evidence. As summarized below, hash values are commonly referred to as "digital fingerprints" or "digital DNA" and have been described as having more than a 99 percent level of accuracy to confirm two files or records match.
Using Hash Values
A “hash value” is the output of a mathematical algorithm run on a digital file or object, and it can be used to confirm that two files or objects are either the same or different. As the Fourth Circuit recently summarized:
A "hash value" is an alphanumeric string that serves to identify an individual digital file as a kind of "digital fingerprint." Although it may be possible for two digital files to have hash values that "collide," or overlap, it is unlikely that the values of two dissimilar images will do so. United States v. Cartier, 543 F.3d 442, 446 (8th Cir. 2008) (No. 07-3222). In the present case, the district court found that files with the same hash value have a 99.99 percent probability of being identical.
United States v. Wellman, 663 F.3d 224, 226 n.2 (4th Cir. 2011) (No. 10-4689) (identifying suspected child pornography by hash values), cert. denied, 132 S.Ct. 1945, 182 L.Ed.2d 800 (2012); see also United States v. Farlow, 681 F.3d 15, 19 n.2 (1st Cir. 2012) (No. 11-1975) (defining hash value as “a short, unique set of numbers and letters produced by running the complex strings of data that make up a computer file through a mathematical algorithm”); United States v. Henderson, 595 F.3d 1198, 1199 n.2 (10th Cir. 2010) (No. 09-8015) (“A SHA value of a computer file is, so far as science can ascertain presently, unique. No two computer files with different content have ever had the same SHA value.”) (quoting United States v. Klynsma, No. CR 08-50145-RHB, 2009 WL 3147790, at *6 (D.S.D. Sept. 29, 2009)).
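The comparison the courts describe is mechanical: compute each file's hash value and check whether the strings match. A minimal sketch in Python, using the standard library's hashlib (the function names here are illustrative, not drawn from any case):

```python
import hashlib

def sha1_of_file(path, chunk_size=8192):
    """Compute the SHA-1 hash value ("digital fingerprint") of a file,
    reading it in chunks so large files need not fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def files_match(path_a, path_b):
    """Two files with the same hash value are, for forensic purposes,
    treated as identical; different values mean different contents."""
    return sha1_of_file(path_a) == sha1_of_file(path_b)
```

This is how tools like the AOL filtering program described below can flag a file without anyone viewing it: the file's computed fingerprint is simply compared against a list of known values.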
Generally, two types of hash values are in common use:
Secure Hash Algorithm Version 1 (or SHA-1): As some cases have noted, “SHA-1 stands for Secure Hash Algorithm Version 1 — a digital fingerprint of a computer file. It is a 32-digit number that is calculated for a file and unique to it.” United States v. Glassgow, 682 F.3d 1107, 1110 n.2 (8th Cir. 2012) (No. 11-2611); see also United States v. Miknevich, 638 F.3d 178, 181 n.1 (3rd Cir. 2011) (No. 09-3059) (“A SHA1 (or SHA-1) value is a mathematical algorithm that stands for Secured Hash Algorithim used to compute a condensed representation of a message or data file.”).
Message-Digest Algorithm 5 (MD5): “An MD5 hash value is a unique alphanumeric representation of the data, a sort of ‘fingerprint’ or ‘digital DNA.’” United States v. Crist, 627 F. Supp. 2d 575, 578, 585 (M.D. Pa. 2008) (No. 07-cr-211). An MD5 hash is likewise unique to a particular file or object, but it is a shorter alphanumeric value than a SHA-1 hash.
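The length difference between the two is easy to see in practice: MD5 produces a 128-bit digest (32 hexadecimal characters), while SHA-1 produces a 160-bit digest (40 hexadecimal characters). A brief illustration with arbitrary sample bytes:

```python
import hashlib

data = b"contents of a sample evidence file"

md5_value = hashlib.md5(data).hexdigest()    # 128-bit digest -> 32 hex characters
sha1_value = hashlib.sha1(data).hexdigest()  # 160-bit digest -> 40 hex characters

# The MD5 value is shorter than the SHA-1 value for the same input,
# and each is stable: hashing the same bytes always yields the same string.
print(len(md5_value), len(sha1_value))  # -> 32 40
```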
Digital Fingerprints and Digital DNA
Hash values have unique identification features. Recognizing this role, a number of cases refer to hash value determinations as “digital fingerprints,” including the following cases:
United States v. Chiaradio, 684 F.3d 265, 271 (1st Cir. 2012) (No. 11-1290) (referring to hash values as “essentially, the digital fingerprint” used to compare files)
United States v. Cunningham, 694 F.3d 372, 376 n.3 (3rd Cir. 2012) (No. 10-4021) ("Each hash value 'is an alphanumeric string that serves to identify an individual digital file as a kind of "digital fingerprint."’") (quoting Wellman, 663 F.3d at n.2)
United States v. Farlow, 681 F.3d 15, 19 (1st Cir. 2012) (No. 11-1975) (defendant suggesting how investigators could “have employed a limited search” by “using the image's ‘hash value’ — a sort of digital fingerprint tied not only to a specific file but also to that file's precise location on a computer”)
United States v. Richardson, 607 F.3d 357, 363 (4th Cir. 2010) (No. 09-4072) (describing how the AOL Image Detection and Filtering Program “recognizes and compares the digital ‘fingerprint’ (known as a ‘hash value’) of a given file attached to a subscriber's email with the digital ‘fingerprint’ of a file that AOL previously identified as containing an image depicting child pornography”)
See also United States v. Miknevich, 638 F.3d 178, 181 n.1 (3rd Cir. 2011) (No. 09-3059) (noting how a SHA1 mathematical algorithm “can act like a fingerprint”)
As another means of describing this identification role, some cases have also referred to hash values as a form of “digital DNA”:
United States v. Crist, 627 F. Supp. 2d 575, 578, 585 (M.D. Pa. 2008) (No. 07-cr-211) (“An MD5 hash value is a unique alphanumeric representation of the data, a sort of ‘fingerprint’ or ‘digital DNA.’”) (“By subjecting the entire computer to a hash value analysis — every file, internet history, picture, and ‘buddy list’ became available for Government review” and the “examination constitutes a search.”) (granting motion to suppress warrantless search of a computer which ultimately had been provided to law enforcement after the defendant failed to pay his rent)
United States v. Beatty, No. 1:08–cr–51–SJM, 2009 WL 5220643, *1 n.5 (W.D. Pa. 2009) (in denying motion to suppress evidence seized from the defendant's computer, noting the agent's affidavit described the SHA1 “digital fingerprint” as “more unique to a data file than DNA is to the human body”), aff'd, 437 Fed. Appx. 185 (3rd Cir. 2011) (No. 10-3634)
United States v. Wellman, No. CRIM A 08CR00043, 2009 WL 37184 (S.D. W. Va. 2009) (noting the investigator described a hash value as “[a] digital fingerprint or a DNA of a file”), aff'd, 663 F.3d 224 (4th Cir. 2011), cert. denied, 132 S.Ct. 1945, 182 L.Ed.2d 800 (2012)
Degree Of Accuracy
Many cases have noted the high degree of accuracy of hash values; indeed, hash value matches have been said to be more precise even than DNA matches. State v. Mahan, 2011 Ohio 5154, n.2 (Court of Appeals, 8th Appellate Dist. Ohio 2011) (noting investigator testimony that “SHA1 values are accurate in identifying a file to the 160th degree, which is ‘better than DNA’”).
The following cases involved evidence suggesting the accuracy of a hash value match exceeds 99 percent:
United States v. Glassgow, 682 F.3d 1107, 1110 n.2 (8th Cir. 2012) (No. 11-2611) (noting “there was a 99.9999% probability that exhibit 1 contained the same video clips that Glassgow possessed”)
State v. Mahan, 2011 Ohio 5154, n.2 (Court of Appeals, 8th Appellate Dist. Ohio 2011) (“There is a certainty exceeding 99.99 percent that two or more files with the same SHA1 value are identical copies of the same file regardless of the file name.”)
United States v. Nelson, No. CR. 09-40130-01-KES (D.S.D. July 12, 2010) (“When two files have the same hash value, there is a 99.99 percent chance that they are the same file.”)
See also United States v. Cartier, 543 F.3d 442, 446 (8th Cir. 2008) (No. 07-3222) (in challenge to probable cause supporting search warrant, rejecting argument “that it is possible for two digital files to have hash values that collide or overlap”)
Theoretically, it is possible for two different files to have the same hash value (referred to as a collision). But this theoretical possibility has yet to be demonstrated in the real world and is extremely unlikely. For the MD5 hash value, the likelihood is 1 in 340 billion billion billion billion. See, e.g., Richard P. Salgado, “Fourth Amendment Search And The Power Of The Hash,” 119 HARV. L. REV. F. 38, 39 n.6 (2006) (“The range of values generated from commonly used hash algorithms is huge. For example, the prolific algorithm MD-5 can generate more than 340,000,000,000,000,000,000,000,000,000,000,000,000 (that’s 340 billion, billion, billion, billion) possible values. The widely used SHA-1 algorithm generates a range of values over four billion times larger than that. Thus, although there is a finite number of possible hash values and an infinite number of possible data inputs, the odds of a collision are infinitesimally small.”); see generally Data Validation Using The MD5 Hash (“There are actually 3.402 x 10^38 or 340 billion billion billion billion or a little more than 1/3 of a googol possibilities. When you consider that most people have never seen a million of anything the actual number becomes really difficult to conceptualize.”); HashCheck Shell Extension - FAQ (“For 128-bit checksums (MD4, MD5), the probability [of a collision] is an unfathomably small 1 in 340 billion billion billion billion, and for SHA-1, it is even smaller.”).
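The figures quoted above follow directly from the digest lengths: MD5 produces a 128-bit value and SHA-1 a 160-bit value, so the sizes of their value spaces can be checked with simple arithmetic.

```python
# Possible digest values for each algorithm
md5_space = 2 ** 128     # MD5: 128-bit digests
sha1_space = 2 ** 160    # SHA-1: 160-bit digests

print(f"{md5_space:.3e}")        # ~3.403e+38, i.e. "340 billion billion billion billion"
print(sha1_space // md5_space)   # 4294967296: SHA-1's range is 2^32 times larger
```

The second figure matches Salgado's observation that SHA-1 generates a range of values "over four billion times larger" than MD5's.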
Generally, courts have rejected questions about the authentication or admissibility of evidence based on remote possibilities unless there is an articulable probability that the validity of the evidence should be doubted. See, e.g., Cartier, 543 F.3d at 446 (while theoretically “hash values could collide ,” accepting government view “that no two dissimilar files will have the same hash value”); see also United States v. Safavian, 435 F.Supp.2d 36, 41 (D.D.C. 2006) (“The possibility of alteration does not and cannot be the basis for excluding e-mails as unidentified or unauthenticated as a matter of course, any more than it can be the rationale for excluding paper documents (and copies of those documents).… Absent specific evidence showing alteration, however, the Court will not exclude any embedded e-mails because of the mere possibility that it can be done.”), rev’d on other grounds, 528 F.3d 957 (D.C. Cir. 2008).
Identification Of Suspected Child Pornography Images
Once a hash value is obtained for a particular file, record, or image, it can be used to confirm or locate other matches. In this manner, hash values are commonly used to identify suspected child pornography images: a library of known child pornography images can be used to determine whether copies of those images are present on a suspect's computer. If a match in hash values between the known and suspected images is confirmed, law enforcement has used this information in support of a search warrant. See, e.g., United States v. Brown, 701 F.3d 120, 122 (4th Cir. 2012) (No. 11-5048) (hash values of downloaded files used to obtain search warrant in child pornography investigation); Cunningham, 694 F.3d at 376 (hash values were used to identify child pornography images and used to show probable cause to seize the defendant’s computer); Chiaradio, 684 F.3d at 271 (hash values used in an "enhanced peer-to-peer software" to “compare the hash value (essentially, the digital fingerprint) of an available file with the hash values of confirmed videos and images of child pornography”; information was used to obtain a search warrant to seize the defendant’s computer); United States v. Cartier, 543 F.3d 442, 446 (8th Cir. 2008) (No. 07-3222) (hash values were used to identify child pornography images and used to show probable cause to seize the defendant’s computer).
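In practice, matching a suspect file against a library of known values amounts to a set-membership check. The sketch below illustrates the idea; the hash values and file path are hypothetical stand-ins, not entries from any real law enforcement database.

```python
import hashlib

# Hypothetical known-image library: SHA-1 values of previously
# confirmed images (stand-in strings for illustration).
known_hashes = {
    "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12",
    "de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3",
}

def is_known(path: str) -> bool:
    """Hash a suspect file and check it against the known library."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large files don't have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest() in known_hashes
```

Because the check compares fixed-length digests rather than image content, it finds exact copies regardless of file name or location, which is precisely how the cases above describe the technique being used.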
A few years ago, there were not many cases noting the application and use of hash values. As this review shows, the acceptance and use of this tool for electronic evidence has become more common and widely applied.
Hash Values Used To Confirm Seized Video Clips And Images
A hash value algorithm was used to show “a 99.9999% probability” of a match between seized video clips and images and known child pornography images; in this manner the hash value provided “a digital fingerprint of a computer file,” in United States v. Glassgow, 682 F.3d 1107 (8th Cir. June 28, 2012) (No. 11-2611)
As we have previously noted, “hash” values are an important tool to identify and authenticate digital evidence. See generally Using “Hash” Values In Handling Electronic Evidence. An Eighth Circuit case demonstrates the use of hash values to confirm electronic evidence at trial.
In the case, the defendant was prosecuted for receipt of child pornography after an investigation led to the identification and seizure of his computer from his residence. Thumbnail images of child pornography were found on the computer. At trial, he challenged the admission of this evidence, arguing that the images "were not expandable for viewing and that the government’s exhibits were only 'similar' to the thumbnail pictures." Glassgow, 682 F.3d at 1109. The type of hash value used in the case was "Secure Hash Algorithm Version 1," or SHA-1, which the court described as "a digital fingerprint of a computer file" that is "unique" to the particular file. Glassgow, 682 F.3d at 1110 n.2. After his conviction by the jury, the defendant claimed error in the introduction of this evidence.
The Eighth Circuit affirmed, noting that expert testimony authenticated the images. Law enforcement had confirmed the images found on the defendant's computer with known images from a law enforcement data base. As the circuit explained:
A government expert, however, verified that the images in exhibits 3 through 17 were the actual enlarged images from Glassgow’s computer. To the extent Glassgow is challenging the government’s exhibit 1 (a DVD compilation of three video clips from a law enforcement database), the SHA-1 values of these videos matched the SHA-1 values of the files offered for distribution from Glassgow’s computer. According to the expert, there was a 99.9999% probability that exhibit 1 contained the same video clips that Glassgow possessed. The admission of exhibit 1 (which was not published to the jury, only described to it) was not unfairly prejudicial. Cf. United States v. McCourt, 468 F.3d 1088, 1092-93 (8th Cir. 2006) (published videos were not found to be unfairly prejudicial).
Glassgow, 682 F.3d at 1110 (footnote omitted).
While the case arose in a child pornography prosecution, it demonstrates the reliability and use of hash values to confirm a match for seized digital evidence. A 99.9999 percent probability certainly is not required to authenticate evidence under FRE 901, which is generally considered not to impose a high hurdle. See, e.g., United States v. Gagliardi, 506 F.3d 140, 151 (2nd Cir. 2007) (noting that “[t]he bar for authentication of evidence is not particularly high”). As the case illustrates, the hash value determination can be an effective tool for the identification and authentication of evidence.
Investigating Child Exploitation Cases - Getting to Critical Internet Evidence Faster with IEF by Jad Saliba and Jamie McQuaid, Magnet Forensics
VIDEO located at;
Michael Petrelli: Good afternoon and good morning to everyone. My name is Michael Petrelli, with Magnet Forensics, and I’d like to welcome you to our webinar today, entitled Investigating Child Exploitation Cases.
Today we’re joined by Jad Saliba and Jamie McQuaid, also from Magnet Forensics, who will lead us in our discussion. In this webinar, we’ll take you through the steps from obtaining a search warrant, to recovering internet forensic artifacts from a suspect’s computer and mobile phone, to producing an understandable report that can be passed off or presented in court. Jad and Jamie will also answer your questions in a live Q&A session after the presentation. So please submit your questions into the Q&A box in the WebEx client during or after the presentation, and we’ll answer them in order.
This presentation is being recorded, and will be available for viewing at magnetforensics.com shortly after our session today. With that, I’ll turn it over to Jad to begin our discussion.
Jad Saliba: Thanks, Mike. Jamie and I will be taking you through the webinar today, so let’s just get right into it.
The case study that we made up for this webinar starts off with an undercover officer conducting an online investigation: he locates a suspect, engages him in conversation online, and gets him to start chatting on Skype. The reason for that is that, as we’ll show later on, Skype stores IP address information for people you’ve chatted with. It stores that on your local machine, so the investigator is using that means of communication to obtain the IP address of the suspect and further the investigation.
So there’s some chatting that occurs on Skype, then the investigator runs IEF on their own computer, runs it against the Skype files on the workstation that was used to conduct the chats, and locates IP addresses for the suspect. Through that, the investigator is able to obtain a court order (it may be called a production order in Canada, or a subpoena in the United States and other countries), which in any case allows us to obtain the name and address of the suspect from their ISP.
Now that we’re armed with that information, as well as other evidence that’s been gathered through this investigation, we’re able to obtain a search warrant for the residence of the suspect. The search warrant is executed at the residence, and there’s a computer on, so we use IEF triage at this point to gather some further evidence. We want to confirm the Skype chats and some other information, as well as look for any illegal material or illicit images on that machine.
By running triage … and we’ll go through some of the evidence that’s found … in later slides, we’re able to find and corroborate the Skype conversations that occurred and the IP address information, and also identify some illicit images by using hashsets of known child exploitation images. With that information, the investigator is able to effect an arrest at the scene, and from there seizes the computer and also locates an Android device in the residence, which is also seized.
So we’re going to show some evidence that was recovered from these devices. Some of it will be related to Google searches from web browser history, there’s going to be some torrent files, and some Kik Messenger chat messages that’ll provide some more supporting evidence, as well as the images that were found and matched on hashsets of known child exploitation images.
Jamie McQuaid: Let’s start with the evidence we’ve collected for this case. Obviously, the first thing we’re going to be looking at is the investigator’s computer that contains the initial Skype conversation as well as the IP address for the suspect. This is going to be a piece of your evidence, whether it’s an IEF report or any other tool that you were using. This is going to be your first piece of evidence that you’re going to be relying on. Based off of that evidence, you can get your court order for the ISP, for the name and the address of the suspect.
The next piece of evidence we’ll be discussing is the live analysis triage. Jad will do a demo of triage and show you how live analysis and some of those techniques can help while you’re executing a search warrant at a person’s home or place of residence. Then finally, we’ll discuss the computer and Android devices that were seized at the scene.
Just to recap, the artifacts we’ll be discussing, first up – Skype, obviously, the chat and IP addresses are a big part of this. Then we’ll be getting into the pictures and the hashing techniques and categories that investigators commonly use. We’ll then show some Kik Messenger artifacts that were recovered from the Android device, as well as some Google searches as well. Finally, we’ll discuss some torrent files that we’re able to carve out of unallocated space, and tie that back to the suspect’s activity.
Let’s jump into Skype. The main.db and chatsync folders are your primary sources of artifact evidence when doing Skype analysis. You can see the user profile location, or the Skype profile for main.db, located there on the main slide. This is for a Windows 7 machine, but the locations are relatively similar whether you’re using another version of Windows or even an iPhone or Android device. The paths are different, but main.db and the chatsync folders are your main sources of evidence.
The chatsync folder contains some additional information that is also reliable, but it was mainly created to help synchronization between one account used on multiple devices. Basically, if you’re running Skype on a mobile device as well as a PC and you answer a call on the mobile device, it won’t continue to ring on the PC. The technical reasons for its existence aside, it still contains some valuable forensic information for us. Specifically, the IP addresses: they’re stored in the chatsync folder’s .dat files and under shared.xml. That information can be pulled out through IEF. The IP addresses in there will include both the internal, NAT’ed IP addresses and the external IP addresses for both users of the conversation; in our case scenario, that’s the suspect and the investigator. This will help correlate data with ISPs to determine names and physical addresses.
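(Editor's note: the main.db chat records described here can be read with any SQLite client. The sketch below assumes the classic Skype desktop schema, a Messages table with author, timestamp, and body_xml columns; the schema may differ across Skype versions, so verify it against your own copy of the database.)

```python
import sqlite3

def dump_skype_chats(db_path: str):
    """Print chat messages from a Skype main.db in chronological order."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            # Skype stores message times as Unix epoch seconds.
            "SELECT author, datetime(timestamp, 'unixepoch'), body_xml "
            "FROM Messages ORDER BY timestamp"
        )
        for author, when, body in rows:
            print(f"[{when}] {author}: {body}")
    finally:
        con.close()
```

In a real examination you would run this against a forensic copy of the database, never the original evidence file.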
Jad: The really cool thing here: in your peer-to-peer investigations you’re probably used to sharing a file with a suspect, getting their IP address, and then, once you have the evidence you need, getting some sort of court order to obtain the physical address and name of the person. But from chatting, this is not something that we’ve been used to getting as part of our investigative evidence up front.
So what Skype is doing here is for users that are being chatted with, as well as the local user, it’s saving, like Jamie said, the private address, which could be useful if you are investigating someone that’s doing this from their workplace. So maybe they’re not doing anything illegal per se from their workplace, but they’re chatting, and then at home they’re doing some other illegal activity. If you’ve got the public IP address only, and it went back to a workplace, you’d be in a bit of a tough situation to figure out exactly who that was. But combined with their private address, now you’d be able to pinpoint a person and then continue your investigation from there. But the really key thing here is it’s getting that public address. So in the screenshot we’ve got from IEF – hopefully you can read that – Kristy Cooper is our undercover officer, and Reggie Dunlop is our suspect. So this information is from Kristy’s computer, and we’re able to get Reggie’s public IP address, so that we can do that production order or a subpoena to get his actual name and address.
So there’s a date and time associated with each entry, and the way you can read this is just: you’ve got a username at the far left, and then whatever else in that record is associated to that user. So the first record, we’ve got a local IP address there for Reggie. It’s indicated in the third column that it’s a local address. And then the Date/Time that that address appeared for that user. So you can combine that date and time with the IP address to fill your court order. And then further down we have … second row down we’ve got the public IP address, and that’s indicated as well.
Now, doing the live analysis piece of this case study that we put together as part of this [indecipherable]: in the case study, we’re executing the search warrant, and we’ve got the suspect’s PC that we want to do some initial triage work on. So a couple of things we want to do – as you all know, confirm that we’ve got the right place, that we haven’t kicked in the wrong door, and that we can get some evidence off the devices that are in that residence to validate all of our work. So we want to make sure that we’ve got some of the Skype conversation that we conducted, and identify any known illicit images using some hash analysis.
So in version 6.3 of IEF we added the ability to load in hashsets of your own – just text files with one hash value per line – and then you’re able to assign a category to that hashset. I’ll demonstrate this in a minute.
We’ve also added support for [Project Vic], a great initiative that’s being led by [indecipherable] and [DHS]. I don’t know if [Rich Brown] or [Jim Cole] are on the call right now, but they’ve done some great work here helping consolidate a lot of these hash databases and improve the workflow, to ensure that we’re not missing anything when dealing with these types of cases. So if you’re on that project and you’ve got hashsets through that, we can also import those now. And for all of these you can assign alerts. So if you want to know immediately when a picture is found that matches on a hash value, either in a [Project Vic] hashset or one that you’ve created and imported, you can get an audible alert along with a window popping up with all the items found that match that criteria, or an email – which doesn’t really apply to the triage scenario, but is something you can use in the lab.
So we’ll just jump out into IEF here, and just quickly demo-cam what I talked about there. This is the main screen of IEF, for anyone who hasn’t used it before. From here, we can first set up our hashsets. We go to the Tools menu, we go down to the second-last item there, called Hash Sets. This is where we can set up all of the hashsets that we want to use. Anything that I load in here now is going to be saved. So I can do this once, preload a number of hashsets, and then the next time I run IEF triage, all this is already good to go, I don’t have to do it twice or every time that I run IEF.
So if I just have a text file with some hash values, I’ll use this bottom part of the screen here to import, remove any that I want to remove down the road, or remove all. So I’ve got a sample list here of Category 1s. These are just some made-up hash values – we used pictures of bears for this case study and created our hash values from those. It’s asking here what category you want to set all these hashes to, so I’m going to call these Category 1s for the purpose of this example. If you had a number of other hashsets for Category 2, 3, or whatever you use in your region, you could import those as well and give them a different hash category.
So I’ve imported that now, it’s in the program. If I want those alerts, I can just check a few of these boxes off, and then I will get the alerts as the search is conducted and it finds any matching values or pictures from this file, it’ll pop up, alerting me with an audible sound and another window showing the pictures that matched on that hash value.
For [Project Vic], very similar – Import, select a file, and then just as you get more delta files from updates on [Project Vic], you can just import those and they’ll get added in. We’re going to add some features showing how many records are currently stored, and the last time you updated. There’s also some other helpful features here.
That’s basically it. We’ve got that all set up now, we’ve got it enabled here, at the top. If I uncheck that, everything stays, but it just won’t use any of these hashsets during the search, and then I can just turn it back on later on if I’d like to. So that again saves you time, but lets you be flexible in how you want to do your searches.
So in this scenario, we would probably be searching the C: drive [from the live] machine, we want to search the operating system drive. Default is the Full Search. I can do a triage search here, and that’s just going to search the common areas and folder locations. Depending on how much time you have, you can do whatever works best for you here. Obviously, the more comprehensive search that you select, the less likelihood of missing anything. But if you’re under some time constraints, this is your fastest search right here. If you have a bit more time, you can do the Quick Search, or I would recommend going with a Custom Search, and then maybe unselecting everything and then just going with ‘All Files and Folders’ – so not going into unallocated, but just grabbing all the live files and searching through all of them.
We’ll leave that on there for now, got our search added, go to the next screen. Again, those familiar with IEF will know this is where you can select all the artifacts to be searched, all the different artifacts that we support are listed, and by default everything is checked, but you can uncheck anything if you want to speed up the search. Again, in this sort of situation, that may be the case. So if you’re very specific on what you need to find during this preview, you could unselect everything and then just select certain groups. So I can uncheck all at the bottom here, and then say I’m really interested in web browser history and run the related evidence, I can double-click on this heading here, check the entire category, and then it’s just going to search for that. And then maybe I’ll add pictures and video.
So you’ve set that up however you like, and then on your last screen, just asking where you want to save the data. So it’s defaulting to where I’m running triage from right now. That way it just defaults right to the [indecipherable] that you’re running triage from. You can change the folder name, enter a case number, and enter an evidence number for the drive that we’re searching. Then click ‘Find Evidence’ and off we go.
So just bringing up a completed case for this case study – we’ve got pictures here, we’ve got about a thousand pictures that were recovered. At the top here, I can select which hash categories that have been identified that I want to see. So right now it’s showing everything. If I deselect everything, all the pictures go away, and then I can just select a specific category.
So I’m going to go with Category 1, just to see which pictures matched on that hashset that I imported. These are our sample Category 1 pictures. If I click on one, scroll down, I can see all the metadata. This doesn’t happen to have any, but if there was a make or model, GPS information in the picture, we would display it there. And then we’ve got our hash values. So we’ve calculated MD5, SHA1, and a PhotoDNA hash here.
The Skype evidence that we want to look at I can jump into by clicking on ‘Chat Threading’. And I can see here some chat messages between our suspect and our UC officer. Click on that, and we’ve got a nice, threaded view of all the messages, similar to what they look at when you’re actually in Skype, and using the Text Messaging feature, the chat feature. So we’ve got all our messages there. We quickly confirm that. We can go to Skype chat messages, to look at them in the traditional view, get some more details, and here’s where we can also look at the IP addresses that we saw in the screenshot.
So we’ve got our usernames again, our IP addresses, the address type, and the Date/Time that they were noted. So we’ve confirmed all our information here, which is really great to be able to do that in potentially just a few minutes, by running triage on the suspect’s live machine.
As I mentioned, IEF supports hashing pictures on any device or any data that you throw at it. We can import multiple hashes, and we support the MD5, SHA1, and a PhotoDNA algorithms for doing matching and for calculating those values for you for all the pictures. And we can integrate with Project Vic fairly well right now.
Michael: Alright. Thanks, Jad. Based on all of the information found, the Skype conversation, and the known illicit images, the investigator is able to make an arrest on the spot. The suspect is then arrested, and the PC and mobile device are seized for further analysis, so that we can take those devices back to the lab and dig a little deeper on both of those, spend a little bit more time on it than you could at the suspect’s house.
The first artifact off the mobile device that we want to talk about is Kik Messenger. Like a lot of mobile chat programs, it uses an SQLite database that’s very common for mobile databases, and it’s named kikdatabase.db. The location is listed there, for the Android device, and you’re able to pull off quite a few details, including contacts, messages, timestamps, as well as the status, and any attachments from Kik Messenger. We can see from the screenshot below – this is from our example, but these databases aren’t encrypted or anything like that, so they can be viewed with any SQLite viewer that you choose to use. But the challenge is always just pulling the database out. Once it’s extracted you can view it with anything you like. So let’s look at the actual evidence.
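(Editor's note: once extracted, the database can indeed be opened with any SQLite tool, including Python's standard sqlite3 module. The table and column names in this sketch – messagesTable, partner_jid, body, timestamp – are assumptions based on common Kik extractions, not documented by the webinar; confirm them by running .schema against your own copy.)

```python
import sqlite3

def dump_kik_messages(db_path: str):
    """Print Kik chat messages from an extracted (unencrypted) database."""
    con = sqlite3.connect(db_path)
    try:
        for partner, body, ts in con.execute(
            "SELECT partner_jid, body, timestamp FROM messagesTable "
            "ORDER BY timestamp"
        ):
            print(partner, body, ts)
    finally:
        con.close()
```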
Jad: So in this case, what we did with the data was a reset, and then a sector-level search on this phone, to show how much data is still available on the phone even after wiping the data, resetting the phone, and so on. So these are all carved messages for Kik Messenger. You can see we’ve got the message type, whether it was sent or received, the partner – the remote user that the person on the phone was chatting with – the status (whether it’s been sent or read), and then the message body itself, and the time. So lots of great information to be found there. And just jumping back, you can see the letters ‘a3k’ after the username – _a3k. That seems to be some sort of identifier, potentially indicating what kind of device the user was on. If you’re on a PC you’ll see something different. Just something worth noting there.
Jamie: The next artifact that we found for this case is some Google search results. This is typically your regular browser artifacts, but you can look at the URL from a Google search, and pull out the search terms that the suspect was looking at. Again, the browser data is stored in an SQLite database called browser2.db, at the location below for the Android device, and again, it’s very similar to any browser type artifact that you would be looking for, but it’s specifically coming from Google.
Jad: As Jamie was mentioning, you can find the search query right inside the URL, and I have pulled those out for you to make it easy for you to identify those without having to look at these long URLs with a lot of metadata and try to piece out the search terms. So you can see there, you’ve got the search term pulled out, the search engine identified, and you can see a bit of the URL itself, and you click on that item, you’ll see the full details for the URL and be able to verify what IEF has pulled out of there. The tricky thing with Google searches is that it is potentially possible to create a link that is a Google search, and someone clicks on the link thinking that it’s one thing, and it sends them to a Google search of something else. And then that would come up in their history. So just something to keep in mind – being able to state definitively that the user typed in the search term requires a little bit of extra work just to validate that it wasn’t some strange website that created that web history on their computer.
Jamie: The next artifact that we found on the suspect’s computer is some torrent files. Most people understand what a torrent file is: it’s a file containing metadata about the files and folders that are shared across P2P networks. This is used for both legitimate and illegitimate purposes. They’re identified by the .torrent extension, and they come along with any sort of media-sharing or file-sharing service on the P2P networks. Specifically for us here, these are searched for, like I said, by the torrent extension as well as by the headers, which are fairly distinctive, so you can carve them out of unallocated space or search allocated space for them as well.
Jad: [indecipherable] again show you an example of some of the data you can find. You could see a bit of the data [in that raw view] on the last screen, but this is what IEF has parsed out of it. And again, we’re just carving this out of unallocated space. So this is just a torrent file that we doctored up to have a specific name, and inside you can see the files that are included in the torrent – filenames and sizes. Obviously, in your actual investigations, there could be quite a few downloaded torrent files with very indicative filenames, both for the torrent itself and for the files included in the torrent, which could be good supporting evidence. And then we’ve got the created time there as well.
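Torrent files use the "bencoding" format defined in the BitTorrent specification, which is what makes the filenames and sizes Jad shows recoverable. A minimal bencode decoder (a sketch for illustration, not IEF's parser; the sample dictionary below is hand-made) can pull the same metadata out of carved bytes:

```python
def bdecode(data, i=0):
    """Decode one bencoded value from bytes, returning (value, next_index)."""
    c = data[i:i+1]
    if c == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i+1:end]), end + 1
    if c == b"l":                      # list: l<items>e
        i += 1
        items = []
        while data[i:i+1] != b"e":
            v, i = bdecode(data, i)
            items.append(v)
        return items, i + 1
    if c == b"d":                      # dict: d<key><value>...e
        i += 1
        out = {}
        while data[i:i+1] != b"e":
            k, i = bdecode(data, i)
            v, i = bdecode(data, i)
            out[k] = v
        return out, i + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", i)
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length

# A tiny hand-made torrent-like dictionary.
meta, _ = bdecode(b"d4:name8:test.txt6:lengthi12ee")
print(meta)   # {b'name': b'test.txt', b'length': 12}
```

Real torrent files carry the same structure with keys like `name`, `files`, and `piece length` inside an `info` dictionary.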
Another thing you can do within IEF – once you’ve located a number of files, you may have had some hash matches, and then you may also find, through your own review, a number of files that are of an illegal nature that you now want to report or put into a database, and you can do this through IEF by using the Export feature which we recently added. You can take a number of pictures, export them to a folder, and then either hash those pictures separately or submit them to an appropriate agency. You can use tools such as C4All on that folder, or [NetClean], and do further analysis from there, or categorization, and then submit your categorizations to the centralized hash database that you use or something like Project Vic. So just another thing to note that you can use from within IEF when recovering pictures.
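The export-then-hash workflow described above can be sketched in Python. This is a stand-in for whatever tool you'd actually use (C4All, NetClean, etc.); the folder and filenames are invented for the demo:

```python
import hashlib
import os
import tempfile

def hash_folder(folder, algorithm="md5"):
    """Return {filename: hexdigest} for every file in a folder."""
    digests = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        h = hashlib.new(algorithm)
        with open(path, "rb") as f:
            # Read in chunks so large exports don't load fully into memory.
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        digests[name] = h.hexdigest()
    return digests

# Demo with a throwaway folder standing in for an IEF export directory.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "pic1.jpg"), "wb") as f:
        f.write(b"\xff\xd8fake-jpeg-bytes\xff\xd9")
    print(hash_folder(folder))
```

The resulting name-to-hash mapping is the kind of list you could submit to a centralized hash database.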
Michael: Okay, so let’s bring it all together with a quick demo of IEF Timeline. I find doing timeline analysis really helps show the entire case and helps visualize it for both the investigator and any stakeholders that are involved. For our case here, those stakeholders might be a judge or a jury; in a corporate-type case, they would more likely be management, a legal team, or HR. Either way, it really does help visualize the case for not only the investigator but any stakeholders involved.
We can see here we’ve got Timeline up, and I’ve prepped this up a little bit. I’ve selected – we’ve got pictures, torrent file fragments, Skype chat messages, and Kik Messenger messages. You can see at the top, all results, there’s a lot of activity that was going on during those events, but if we spread it out and look at the specific artifacts that we’re looking for, we can see a pretty good timeline of events and how it occurred, starting with the pictures as we take a look here, pulling it up. We can see the pictures of bears here, which, like Jad had mentioned, we’re using as representation of the illicit images. There are a number of pictures here pulled down on the 6th of May. So this is several days prior to our initial engagement.
So we’ve got the pictures, and then next, on the 8th, a few days later, we’ve got the torrent file fragments – again some questionable, and definitely illegal, behavior. Then we’ve got the Skype chat conversations. So we can pull in here and see this as the initial engagement for our case study, where Reggie Dunlop, our suspect, was speaking with the undercover officer/investigator – and we can see that conversation happening here, on the 14th. And on the same day, we can see there’s the additional Kik Messenger conversation that the suspect had with another third party that wasn’t our investigator. So bringing all this together really helps show a timeline of events and how things occurred; it shows that our suspect was engaging in this activity well prior to the investigator engaging him over Skype. This really helps showcase all that information.
Jad: Yeah, and this can be great to either find activity on data you didn’t expect to find, or if you do have certain data you’re very interested in, you can just zero in on that timeframe and then see all the different activity in a chronological order that happened on that timeframe. In this case, this kind of represents the escalating behavior of someone that’s looking at child exploitation images, downloading some torrents relating to illicit images and videos, and then engaging in some conversations online that are inappropriate, illegal conversations, where they’re attempting to lure someone over the internet. So a lot of great information that you can find through timeline analysis, and sometimes find some activity or behavior that you weren’t expecting to see.
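At its core, the timeline view described above is a chronological merge of artifacts from different sources. A minimal sketch (illustrative timestamps loosely based on the case study's dates; not IEF's implementation):

```python
from datetime import datetime

# Each artifact source yields (timestamp, source, summary) tuples; merging
# and sorting them gives a unified timeline like the one demoed above.
pictures = [(datetime(2013, 5, 6, 14, 2), "Picture", "IMG_0042.jpg created")]
torrents = [(datetime(2013, 5, 8, 21, 15), "Torrent", "files.torrent carved")]
chats    = [(datetime(2013, 5, 14, 19, 30), "Skype", "chat with undercover officer"),
            (datetime(2013, 5, 14, 20, 5), "Kik", "chat with third party")]

# Tuples sort by their first element, so this orders everything by time.
timeline = sorted(pictures + torrents + chats)
for when, source, summary in timeline:
    print(f"{when:%Y-%m-%d %H:%M}  {source:8} {summary}")
```

The sorted output makes the escalation pattern (pictures, then torrents, then luring conversations) immediately visible.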
Just to summarize what we found: we went through an investigation where an undercover officer was engaging with someone online and got them on Skype so that we could get the suspect’s IP address. We got the IP address, got a court order to obtain their name and address, executed a search warrant at the address, found some more evidence confirming what we already knew, and then took that back to the lab for further examination, confirming illicit pictures, some other conversations, Google searches relevant to the case, and torrent files.
And techniques that we used here are live system triage – so if you’ve got a live system and you’re executing a search warrant, it’s great to be able to grab some of that live data and have some information right away that you can either use in bail court or also capturing the live RAM to take back to the lab to do further analysis on, where you can recover a lot of data that wouldn’t otherwise be available on a hard drive if you just simply shut down a machine and take it back to the lab. So that’s really key there.
The hash analysis that we went through to make it easy to quickly identify known child exploitation images, and then chat threading and timeline analysis to help filter through the data, make sense of it, present it in court in a meaningful manner that’s understood by a judge and jury or investigator who you’re passing the data off to.
So that kind of summarizes what we had here in this case study that we made up here. Hopefully a lot of this information has been useful to you, and maybe it’s given you some ideas for your future investigations, or even ones that you’re currently working through now.
Michael: Great. Thanks a lot, Jad and Jamie. We’ve attempted to show everyone how IEF has helped find critical artifacts in this case and sped up the recovery process, ultimately saving the investigator’s time. So these are the benefits that are experienced by thousands of forensic professionals as they use IEF to recover and analyze hundreds of artifacts on computer, smartphones, and tablets. So if you haven’t already tried IEF, we’d like to offer you a free, 30-day trial, and encourage you to try it on new or existing cases. You can visit magnetforensics.com/trial to download your free copy today.
We’re now going to jump into our Q&A session. So if you have any questions, please submit them into the Q&A box in the WebEx client, and we’ll answer them in order. There are a few that have come in already throughout the presentation, guys, so we’ll ask a few of these to Jad and Jamie, and those two guys will help us answer these here ourselves.
One of them is here: Do you have hashlists or sets that you can make available to the audience?
Jad: Unfortunately we don’t. We’re working on building in some freely available hashsets that allow us to whitelist certain files – not pictures, but things like operating system files – and what that would let us do is skip over known good files that won’t contain any user data or anything like that. But we don’t have our own hashlists and we haven’t been able to get access to any. What I would direct you towards is any regional labs in your area that maintain their own hashlists, and of course Project Vic, which, if you’re eligible to join that project, is building a really clean, well-organized set that’s been verified and re-verified. And what they’re building there is a consolidated list, so that instead of having multiple silos of lists around the country, we can have one list that everyone uses and everyone contributes to, and that just helps make the identification process much more thorough and streamlined. So those are my suggestions for getting access to some hashlists.
Michael: Okay. Thanks, Jad. Next question here: Do you know if that contains remote port information for connections through carriers such as [indecipherable]?
Jad: I don’t believe the port information is stored. It’s something we can double-check, but we’d have to reverse-engineer all the data in these chatsync dat files, and all we’ve been able to find so far related to IP addresses, beyond other information like messages and so on, is just a date and time and whether it’s a local or public IP address.
Jamie: Just to add to that, Skype sets its specific port on installation, so it knows what port it’s going to be listening on. So it’s not a variable that changes too often, but yeah, the IP addresses are what’s mostly there, in terms of valuable evidence anyway.
Michael: So on the Skype theme there, there’s a question: Is the Skype IP address there by default? Does the user have to send a file to capture this info?
Jad: The information comes through messages that were sent, so as long as you can engage in a few messages … the chatsync files are a bit of an anomaly. Sometimes you may create chatsync files … definitely over time you will, but if you had a really quick chat, one or two messages, you may not find chatsync files in your folder. So if you can engage in at least, I would say – and this is just a guess – 10 to 15 messages, you should have some chatsync files that will contain that information. And there’s no setting to turn those IP addresses in the chatsync files on or off; it’s just something that’s built into Skype. So that’s kind of the nice thing: even if someone is aware of these IP addresses, there’s no way for them to turn them off.
Michael: Okay. I think in the same vein here, there’s a question: Can you just drag and drop the main db to get all of the IP information?
Jad: The main db, while it does contain a lot of very useful information, is not the file that stores the IP addresses. For every Skype user account on a hard drive, you’ll have a folder named after their Skype username, and under that folder there’s another folder called ‘chatsync’, and under that, more subfolders with dat files. Those dat files in the chatsync file format are the ones that contain the IP addresses. So if you did want to look just for those, you could point IEF at the chatsync folder and recover all the information just from those files.
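The folder layout Jad describes (profile folder per username, a `chatsync` subfolder, then subfolders of `.dat` files) is easy to enumerate yourself. A sketch, using a throwaway directory and invented names to stand in for a real Skype profile tree:

```python
import os
import tempfile

def find_chatsync_files(skype_root):
    """Yield paths of *.dat files under each <username>/chatsync subtree."""
    for user in os.listdir(skype_root):
        chatsync = os.path.join(skype_root, user, "chatsync")
        if not os.path.isdir(chatsync):
            continue
        for dirpath, _, filenames in os.walk(chatsync):
            for name in filenames:
                if name.lower().endswith(".dat"):
                    yield os.path.join(dirpath, name)

# Demo: build a throwaway folder mimicking the Skype profile layout.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "reggie.dunlop", "chatsync", "1a")
    os.makedirs(target)
    open(os.path.join(target, "1a2b3c.dat"), "wb").close()
    print(list(find_chatsync_files(root)))
```

Pointing a tool at just this subtree, as Jad suggests, limits the search to the files that actually hold the IP data.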
Michael: Okay. Thanks, Jad. Another question on IP addresses: How is the IP obtained if the user is behind Tor or another service?
Jad: If the user is on Tor, you are going to get the IP address of the exit node that they are currently using. So if someone is using Tor, as long as they are using it correctly and everything is configured properly so that all the Skype traffic is going through the Tor relay, you will only get the IP address of the last Tor relay, what’s called the exit node. If they’ve misconfigured it or it’s not quite working properly – which is quite possible with something like Skype; there are Tor bundles that boot a preconfigured version of Firefox set up to use Tor very safely, but I don’t believe there’s one for Skype, so it’s something they’d have to set up themselves – it’s very likely that they may not set it up correctly.
Michael: Okay. Next question: Is it possible to see how many times a torrent file has been uploaded from the user? Now, there’s a big difference between downloading child pornography and distributing it where this user lives.
Jad: The torrent files that we discussed today are just the files you would download to get access to the files referenced by that torrent. There’s no upload information stored in those files. Where you may find upload information is in configuration files for the torrent client the user is using – something like µTorrent, BitTorrent, and so on. Those clients keep configuration files that may show you what the user has shared out themselves. And if they’ve downloaded a torrent, they’ve likely shared at least a piece of it out, because that’s how the torrent network works. But that’s something we’ve got on the list to add to IEF, to give you more context around information like that. And I understand what you’re saying – it’s a very different thing to prove that they’ve uploaded child pornography rather than just downloaded it.
Michael: Okay. So we’ll move on to our next one here, dealing with web surfing: how would you determine that someone actually went to a web page, and that it wasn’t just a page that accidentally opened?
Jad: Depending on the browser used, there’s metadata with the history record that can tell you something about how they got to that record – whether they clicked on a link or were redirected, which would be similar to a popup. So depending on the browser, you may be able to show that it was something they clicked on or even typed in themselves, versus a redirect. If it’s a browser that doesn’t store that kind of context, then it’s a little more difficult to say definitively that it wasn’t just a popup window, and you’ll have to use either timeline analysis or sorting by dates and times on the web history, look at the context around that link, see where they were just before that, and maybe visit the site yourself to see what kind of activity is there – are there popups and so on – essentially doing a further investigation on that activity yourself.
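The per-visit metadata Jad mentions is real in Chromium-based browsers: the history database's `visits` table stores a `transition` value whose low byte records how the visit began. The sketch below uses an in-memory stand-in for the schema (the real database has many more columns) with a few core transition codes taken from Chromium's source; it is an illustration, not IEF's implementation:

```python
import sqlite3

# A few core page-transition codes used by Chromium's history database;
# the low byte of the "transition" column records how a visit began.
CORE_TRANSITIONS = {0: "link", 1: "typed", 5: "generated",
                    7: "form_submit", 8: "reload"}

def visit_origin(transition):
    """Decode the core transition type from a visits.transition value."""
    return CORE_TRANSITIONS.get(transition & 0xFF, "other")

# Minimal stand-in for a Chrome History db.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE visits (url TEXT, transition INTEGER)")
db.executemany("INSERT INTO visits VALUES (?, ?)",
               [("https://example.com/a", 0x30000001),   # typed by the user
                ("https://example.com/b", 0x30000000)])  # followed a link
for url, transition in db.execute("SELECT url, transition FROM visits"):
    print(url, "->", visit_origin(transition))
```

A "typed" transition is much stronger evidence of deliberate visiting than a "link" or redirect-style transition.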
Michael: Okay. Now, we mentioned Project Vic a few times through the presentation. Are you able to clarify what that is for our audience?
Jad: Yeah. I certainly am not an expert on all the details of the project, but my understanding is that it’s a project to consolidate all the different hash databases that exist within law enforcement for child exploitation images, get rid of any false positives and duplicates, and create one single hash database that everyone investigating child exploitation can use. So the benefit of that is that there’s one place to get your hash database from, and one place to share your categorized images to, so that everyone can benefit from what everyone else is finding. And if you’re not used to using hash databases in your child exploitation investigations, you can use that hash database to pre-categorize a number of images.
So if you’ve recovered, just as an example, a million images from a hard drive, instead of having to go through every single image manually determining whether it’s legal material or not, you can use these hash databases to pre-categorize a number of pictures. That may save you quite a bit of time and effort going through all the pictures manually – having to view them again, with some of the trauma associated with that – and then you can quickly finish off the uncategorized images yourself, saving a lot of time.
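The pre-categorization step amounts to a hash lookup against a known set. A minimal sketch (the hash set and category label here are invented placeholders, not real Project Vic data):

```python
import hashlib

# A hypothetical known-hash set keyed by MD5, as a Project Vic-style
# database import might provide. The entry below is a placeholder.
KNOWN_HASHES = {
    "9e107d9d372bb6826bd81d3542a419d6": "category 1",
}

def precategorize(image_bytes):
    """Return the known category for an image's bytes, or None if unknown."""
    return KNOWN_HASHES.get(hashlib.md5(image_bytes).hexdigest())

# Only images whose hashes miss the set need manual review.
print(precategorize(b"The quick brown fox jumps over the lazy dog"))
```

Everything that matches is categorized automatically; only the misses need to be viewed by the examiner.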
I believe that’s the main goal of Project Vic, and I believe [kind of the term] No Child Left Behind is associated with it as well – making sure that all child victims are identified in these investigations, and that no child is left unfound because a picture was missed or because the caseload was too high and you weren’t able to do as thorough an examination as you would have liked on that case.
Michael: Okay, thanks. Good explanation, Jad. Can IEF detect hidden images such as steganography?
Jad: Depending on what kind of steganography is being used – we do carve through files for images, even if a file is named something else or sits in unallocated space and so on. So if they’ve hidden a picture in a file that doesn’t appear to be a picture, but the picture itself is still intact inside that file, whatever steganography tool was used, we would still carve that file looking for pictures, recover the picture, and tell you that it was found within that file. So it really depends on how complex or advanced the steganography technique is, but if it’s a simple case of placing a picture inside another file, we should be able to recover it.
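The carving Jad describes works because intact JPEGs keep their start-of-image and end-of-image markers wherever they're embedded. A toy illustration (the "container" bytes are fabricated; real carvers also validate structure, handle fragmentation, etc.):

```python
def carve_jpegs(blob):
    """Return candidate JPEG byte ranges found by SOI/EOI markers."""
    hits = []
    start = blob.find(b"\xff\xd8\xff")           # JPEG start-of-image
    while start != -1:
        end = blob.find(b"\xff\xd9", start + 3)  # end-of-image marker
        if end == -1:
            break
        hits.append(blob[start:end + 2])
        start = blob.find(b"\xff\xd8\xff", end + 2)
    return hits

# A fake "container" file with a JPEG-like payload buried in the middle.
container = b"PADDING" + b"\xff\xd8\xff\xe0fakejpegdata\xff\xd9" + b"PADDING"
print(len(carve_jpegs(container)))   # 1
```

If a steganography tool simply appends or wraps the picture without transforming its bytes, this kind of signature scan still finds it.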
Michael: Okay, good. The next question is about the write-blocking capabilities of IEF … in the initial part of the presentation, where we were connecting to the suspect’s computer, were we inserting a dongle to which evidence is downloaded? And can we also comment on the write-blocking capabilities in that scenario?
Jad: Yeah. Triage is run off a dongle, so plugging it in does create some USB entries associated with that dongle. As far as write-blocking goes, the way that we access files is by going a layer underneath the file system and manually parsing the NTFS or FAT file system from within IEF. So we’re not using Windows processes or APIs – the typical way that a program would open a file and potentially change the last access time or even cause some data to be changed. We’re doing it through a read-only method internally that doesn’t even touch the file system APIs: using the MFT on the NTFS file system, we locate the file and read the raw sectors for that file, and that avoids anything being changed – any metadata like access times.
And we also have a feature in Triage, called stealth mode, for when you are doing a covert search of a computer – this is related more to some of the military customers we have, doing operations where they’re searching a computer and have a savvy suspect who may check their computer for activity later on, and they don’t want any traces of activity left behind. Stealth mode will remove those USB traces and the prefetch files that are created by inserting the dongle and running IEF, removing any indication that IEF was run.
Now, in a law enforcement scenario, that’s probably not something you want to do. It’d probably be better just to explain in court that those artifacts were created by using IEF, that they don’t impact the data to any extent, and leave it at that. I think that’s easier to deal with in court than talking about deleting files or registry entries.
Michael: Thanks, Jad. Next question: Is there a way to tie pictures to Kik Messages, sent and received pictures?
Jad: Yeah. We didn’t touch on it in this scenario, but there’s another database that contains Kik attachments. In our last release we added the ability to correlate those messages to a user, with dates and times, and show those attachments, whether they’re pictures or something else associated with the user’s [indecipherable] the attachments.
Michael: Okay. Next question around picture hashing: Can a quick search for one picture hash be done?
Jad: Yes, you could import a file with a single hash, give it a category or no category, and then run a search just for pictures; every picture would be compared to that one single hash, and only identified and given a category if it actually matched that one hash value.
Michael: Okay, good. What do the skin tone values in IEF mean or represent?
Jad: For every picture that we recover – and it’s an option you can turn on or off prior to starting the search – we calculate a skin tone percentage. We use a number of algorithms that look at the colors in the picture, compare them to skin tones, and run multiple calculations to identify whether skin tone is present in that picture, and then the picture is given a percentage. So if there’s quite a bit of skin tone, you’ll have a higher percentage there. Typically, pictures of nudity will have a high percentage of skin tone calculated. And then there’s a skin tone filter at the top of the report viewer that you can set to a certain percentage as a minimum.
So if you set that to 55%, you’re only going to see pictures that have at least 55% skin tone calculated for that photo. And that can help you quickly, again in a triage-type situation, get to the pornography pictures, whether they’re illegal pornography or legal, and review them that way. Because of how skin tone detection works, you’re also going to get a number of pictures with colors in them that are similar to common skin tones. So just a caveat there.
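To make the idea concrete, here is one crude skin-tone heuristic over RGB pixels. This is a commonly cited rule of thumb, not IEF's actual algorithm, and the sample pixels are invented; it also shows the false-positive caveat Jad mentions, since any color in the range counts:

```python
def skin_tone_percentage(pixels):
    """Percentage of (r, g, b) pixels falling in a crude skin-tone range.

    One common heuristic treats a pixel as skin-like when R > 95, G > 40,
    B > 20, R > G > B, and R - G > 15. Real tools are more sophisticated.
    """
    def is_skin(r, g, b):
        return r > 95 and g > 40 and b > 20 and r > g > b and r - g > 15

    if not pixels:
        return 0.0
    hits = sum(1 for r, g, b in pixels if is_skin(r, g, b))
    return 100.0 * hits / len(pixels)

# Two skin-like pixels and two that are not.
sample = [(220, 170, 140), (200, 140, 110), (30, 60, 200), (10, 200, 10)]
print(skin_tone_percentage(sample))   # 50.0
```

A threshold filter like the 55% example above would then simply discard pictures scoring below the cutoff.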
Michael: Okay. On hashing again – will the hash change if the photo is cropped or resized?
Jad: Yes, it will. If you change one byte or one pixel in the picture, the MD5 or SHA1 hashes, which are mathematical hashes, will change. And that’s the great thing about PhotoDNA. We’ve recently added that new kind of fuzzy hashing ability that Microsoft developed and has provided to people who make software assisting investigators combating child exploitation. With a PhotoDNA hash, a number of operations are done on the picture, so you’ll still get a match saying that two pictures are similar even if you crop the picture, change it to black and white, change a piece of it, or resize it – most of these operations will still result in a hash match using PhotoDNA. And that’s exactly the problem that was seen by a number of people in law enforcement who spoke to Microsoft, who were good enough to develop a solution for it. So hopefully PhotoDNA will be used more and more in these types of cases, to help combat those kinds of issues where pictures are being resized or slightly modified to avoid detection.
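PhotoDNA itself is proprietary, but a much simpler perceptual hash, the "average hash", illustrates the general idea of robust matching: derive the hash from coarse image structure rather than exact bytes, so small edits leave it nearly unchanged. The 8x8 grayscale "images" below are synthetic toy data:

```python
def average_hash(gray):
    """64-bit average hash of an 8x8 grayscale grid (64 ints, 0-255).

    Each bit records whether a cell is brighter than the image's mean,
    capturing coarse structure instead of exact pixel values.
    """
    mean = sum(gray) / len(gray)
    bits = 0
    for value in gray:
        bits = (bits << 1) | (1 if value >= mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes; small means similar."""
    return bin(a ^ b).count("1")

# A toy 8x8 "image" and a slightly brightened copy: MD5 would differ
# completely, but the perceptual hashes stay close.
original = [i * 4 for i in range(64)]
tweaked = [min(255, v + 10) for v in original]
print(hamming(average_hash(original), average_hash(tweaked)))   # 0
```

PhotoDNA is far more sophisticated (and resistant to crops and recoloring), but the contrast with cryptographic hashing, where a one-pixel change scrambles the entire digest, is the same.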
Michael: Okay, another time zone question here: how do time zones themselves and the differences affect the timeline?
Jad: I believe in the timeline, everything by default is set to UTC time, and then you can set a time zone in IEF, which will apply a plus or minus offset to those dates and times; if daylight savings is involved, then depending on what time of the year the date falls, it will apply that daylight savings offset as well. You can set that if you’d rather see things in your local time zone, or you can leave everything in UTC.
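The offset arithmetic Jad describes is straightforward to demonstrate. The timestamp below is invented; the zone is a fixed UTC-5 base offset plus a one-hour daylight-savings adjustment (i.e. Eastern Daylight Time), applied for display only:

```python
from datetime import datetime, timedelta, timezone

# A carved artifact timestamp, stored in UTC as the tool does by default.
utc_time = datetime(2013, 5, 14, 23, 30, tzinfo=timezone.utc)

# Examiner-chosen zone: UTC-5 base offset plus +1 hour of daylight savings.
display_zone = timezone(timedelta(hours=-5) + timedelta(hours=1))
local = utc_time.astimezone(display_zone)
print(local.strftime("%Y-%m-%d %H:%M %z"))   # 2013-05-14 19:30 -0400
```

Note the underlying instant never changes; only the rendered wall-clock value does, which is why leaving evidence in UTC is always a safe default.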
Michael: Okay. Next question: Does this work on Linux or Unix platforms?
Jad: Triage is Windows-only right now. We’re working on making it bootable, so that if you have a Linux or Mac machine, you can either shut it down – or if you come up to a machine that’s already shut down, including a Windows machine – you could boot up using the Triage thumb drive and then run IEF on that machine in a forensic manner that mounts everything read-only, giving you additional reassurance that nothing’s being changed. That would also allow us to run on Linux machines and Macs, because we do all the file system parsing and interpretation manually and natively, so we can handle the Linux file systems like [ext3 and ext4] and the HFS file systems on Macs. So we would still be able to interpret the file system and do the search, which needs to run on a Windows platform. Once we get that bootable option working, it’ll extend IEF’s ability to those other platforms.
Michael: Similar vein: What file systems does this read? For example, FAT, NTFS.
Jad: Yeah – ext2, ext3, ext4, HFS+, HFSX, FAT, FAT32, and NTFS … exFAT, and [indecipherable] file systems, which are found on Android devices. I’m probably missing some, but most of the common file systems are supported.
Michael: Okay. A question on interoperability here with other platforms – are we still working with Guidance Software such that all IEF results can be imported into EnCase for reporting? If so, is this as easy as a drag-and-drop, or is it importing results one at a time?
Jad: Yeah, we’re still working with Guidance that way. We’ve got a couple of EnScripts that you can run. We’ve got an import script: if you’ve already run IEF on a case, created an IEF case, and you want to import those results into EnCase, you can run the script, point it at the IEF case, and it’ll pull all the results into EnCase. The other scripts, which are for version 6 and version 7 of EnCase, will just launch IEF, run it against your images, and then bring the results back in, or export them to Excel files if that’s what you prefer. And we also have a case processor module for EnCase 7: if you’re using that pre-processing module ability, you can include IEF in that pre-processing and have it run against all of your items as well, bringing the results back into EnCase.
Michael: Okay. Thanks, Jad. The parsed search queries are a useful tool, but there are often junk entries found among the search terms. Is there a good method of weeding through these junk terms?
Jad: Yeah, we’ve been looking at that, trying to find a way of removing the junk terms without getting rid of anything legitimate. That’s the difficulty – we don’t want to remove any positive hits while we’re trying to remove the false positives. I think sometimes just sorting by the search term can be helpful, and you can find things that were typed grouped together. Or sorting by date and time can also help: you can see all the searches that were done in a certain timeframe together. But we’re looking at whether there are certain terms we can put in a list and always filter out. That’s something that we’re going to try to do.
Michael: Okay. Very good. So we’ve come to the top of our hour here, and we’d like to thank Jad and Jamie again for the presentation and for taking us through the Q&A session here. We’d like to thank everyone online for attending and participating in the discussion today. If you do have any further questions, feel free to email them to Jad or Jamie at the addresses we have listed here. The session was recorded, and it will be available for viewing at magnetforensics.com in our Webinar page within the next 24 hours. We’ve also captured all of the questions that have been sent through. So we’ll be getting Jad and Jamie to answer those, and we’ll post those to the same site by the end of the week.
So thank you again, everyone. This concludes our presentation today.
Jad: Thanks, everyone.
End of Transcript