Found in 7 comments on Hacker News
specialist · 2024-03-06 · Original thread
You are emphatically, logically, ethically, technically, securely, and in all other ways correct.

Yes and:

> Encrypting data while at rest (in storage) as well as in-transit is the way to go.

All PII must be encrypted at rest at the field level.

Just like how passwords are properly stored. This is not rocket science.

The book Translucent Databases demonstrates this technique for common use cases. Highest recommendation.

https://www.amazon.com/Translucent-Databases-Peter-Wayner/dp...
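
A minimal sketch of the core move, assuming Python and an invented SSN field (my illustration, not code from the book): store a salted one-way digest of the field, exactly the way a password should be stored.

    import hashlib
    import os

    def protect_field(value: str, salt: bytes) -> bytes:
        # One-way protection, password-storage style: scrypt is deliberately
        # slow, which makes brute-forcing the stored digests expensive.
        return hashlib.scrypt(value.encode(), salt=salt, n=2**14, r=8, p=1)

    salt = os.urandom(16)                          # random salt, stored alongside
    stored = protect_field("123-45-6789", salt)    # the digest goes in the DB, not the SSN

    # Equality checks still work without ever keeping the plaintext:
    assert protect_field("123-45-6789", salt) == stored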

specialist · 2022-07-19 · Original thread
The great irony is that actual privacy requires unique identifiers, like RealID or equiv.

GUIDs unlock the Translucent Databases achievement: actual per-field encryption of PII data at rest. TLDR: clever applications of salting and hashing, just like with proper password storage.

https://www.amazon.com/Translucent-Databases-Peter-Wayner/dp...

http://wayner.org/node/46
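
Roughly what I mean, sketched in Python (the key handling and ID format are invented for illustration): a keyed hash of a stable unique ID yields a deterministic pseudonym you can store and join on, without the ID itself ever touching disk.

    import hmac
    import hashlib

    SECRET_KEY = b"kept in a KMS/HSM, never in the database"   # invented

    def pseudonym(real_id: str) -> str:
        # The same ID always yields the same token,
        # but the token is useless without the key.
        return hmac.new(SECRET_KEY, real_id.encode(), hashlib.sha256).hexdigest()

    # Records keyed by this pseudonym can be linked across tables
    # without exchanging or storing the plaintext identifier.
    print(pseudonym("REAL-ID-000123"))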

I was utterly against RealID, until I figured this out. Much chagrin. Super embarrassing.

Source: Worked on both electronic medical records and protecting voter privacy. Did a translucent database POC for medical records, back in the day.

If there's another technical solution, I haven't found it.

But I think to your point, people generally don't want the sensitive data being collected in the first place. I don't have an answer for that.

specialist · 2021-12-30 · Original thread
Two tangential "yes and" points:

1)

I'm not smart enough to understand differential privacy.

So my noob mental model is: Fuzz the data to create hash collisions. Differential privacy's heuristics guide the effort. Like how much source data and how much fuzz you need to get X% certainty of "privacy". Meaning the likelihood someone could reverse the hash to recover the source identity.

BUT: This is entirely moot if the original (now fuzzed) data set can be correlated with another data set.
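
For what it's worth, the basic "fuzz" step as I understand it is the Laplace mechanism. A toy sketch in Python (the epsilon and the count are placeholders; real differential privacy is far more careful than this):

    import random

    def laplace_noise(scale: float) -> float:
        # The difference of two exponentials samples a Laplace distribution.
        return random.expovariate(1 / scale) - random.expovariate(1 / scale)

    def private_count(true_count: int, epsilon: float = 0.1) -> float:
        # A count changes by at most 1 per person (sensitivity 1), so noise
        # with scale 1/epsilon gives epsilon-differential privacy.
        # Smaller epsilon = more privacy = more fuzz.
        return true_count + laplace_noise(1 / epsilon)

    print(private_count(4217))   # e.g. 4209.3 -- close, but deniable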

2)

All PII should be encrypted at rest, at the field level.

I really wish Wayner's Translucent Databases was better known. TLDR: Wayner shows clever ways of using salt+hash to protect identity, just like properly protected password files are salted and hashed.

Again, entirely moot if protected data is correlated with another data set.

http://wayner.org/node/46

https://www.amazon.com/Translucent-Databases-Peter-Wayner/dp...
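
And for fields you have to read back (where a one-way hash won't do), per-field encryption looks something like this sketch, assuming the Python "cryptography" package; key management, the actually hard part, is hand-waved here:

    from cryptography.fernet import Fernet

    key = Fernet.generate_key()        # in practice: a per-field key from a KMS
    f = Fernet(key)

    # Encrypt each sensitive column value individually before it hits the database.
    row = {"patient_id": "p-42", "dob": f.encrypt(b"1970-01-01")}

    # Only code holding the key can recover the plaintext.
    print(f.decrypt(row["dob"]))       # b'1970-01-01'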

Bonus point 3)

The privacy "fix" is to extend property rights to all personal data.

My data is me. I own it. If someone's using my data, for any reason, I want my cut.

Pay me.

specialist · 2021-08-04 · Original thread
All PII must be encrypted at all times. At the field level.

Translucent Databases explains how.

https://www.amazon.com/Translucent-Databases-Peter-Wayner/dp...

http://wayner.org/node/46

Source: Was once an insider. Created and ran electronic medical record exchanges.

specialist · 2019-08-24 · Original thread
re: IRMA

I've been thinking about negotiated disclosure since the mid 90s. Back then we called it faceted personas: an effort to protect oneself from aggregators of demographic data.

I've gotten nowhere.

TLDR: 99% certain deanonymization will always prevail.

Not saying I'm right. I'm not particularly smart or insightful. I just try to apply ideas foraged from academia to real world problems. Alas, the times I've slogged thru the maths and algos, I'm always left befuddled. I'm just not clever enough to figure out all the attack vectors. (I'd make a terrible criminal.)

--

re: Privacy by Design

That means Translucent Databases. Where all data at rest is encrypted. Just like you salt and hash password files.

This book details clever applications of that strategy to real world problems:

https://www.amazon.com/Translucent-Databases-Peter-Wayner/dp...

Mea culpa: I'm still unclear how GDPR's tokenization of PII in transit works in practice. Anyone have some sample code? And I still don't see how it protects data at rest.
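
My best guess at what tokenization means in practice, sketched in Python (my reading, not anything from the GDPR text itself): swap each PII value for a random token before it travels, and keep the token-to-value mapping in a separate, locked-down vault.

    import secrets

    VAULT = {}   # stands in for a separate, access-controlled token vault

    def tokenize(pii: str) -> str:
        # Replace the value with an opaque random token;
        # only the vault can map it back.
        token = secrets.token_urlsafe(16)
        VAULT[token] = pii
        return token

    def detokenize(token: str) -> str:
        return VAULT[token]

    t = tokenize("jane.doe@example.com")
    print(t, "->", detokenize(t))   # only the token travels over the wire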

--

Source: Designed, implemented, and supported some of the first electronic medical records exchanges (BHIX, NYCLIX, others). Worked on election integrity for a decade, including protecting voter privacy (secret ballot).

--

Prediction: Accepting that de-anon will always win in the long run, we'll eventually also accept that privacy has a half-life. To adjust, we'll adapt differential privacy algos into temporal privacy.

specialist · 2013-02-01 · Original thread
This rule clarification is good in that it acknowledges the participation of third parties. Yay!

But it doesn't change the fact that HIPAA is just kabuki (for show).

I worked on some of the first RHIOs (regional health information organizations) on the market. We all had yearly HIPAA training: all platitudes and very little actionable advice. As devs, we all had full access to millions of patient records.

Accidental disclosure is inevitable. So many participants, so many systems, the weakest link and all that. We all figured it was a matter of time before something bad happened.

I care about privacy. A lot. I researched what's what, legal and technical. Because I want to do a good job. And I have skin in the game (my own medical history).

The month I started on the electronic medical records project, a local hospital had just settled a case over letting 100,000s of complete patient records leak. (A stolen laptop.) So I contacted the lawyers on both sides. The verdict? Try harder next time.

Pretty much nothing has changed (improved) since. Except the disclosure requirements, I guess.

This is a long topic, so I'll just skip to the conclusion:

We will not, cannot protect patient privacy until we assign a universal unique identifier to every single person. This means something akin to RealID.

To protect patient privacy, we need to encrypt the data. But that's not feasible without globally unique identifiers, because patient demographic data is dirty and a mismatched record can be fatal. So you have matching algorithms that have to look at the original plaintext. And the heuristics are wrong often enough that the process requires human oversight.

If we (the USA) had unique identifiers, then we could transition to translucent database designs. That'd be very cool.

http://www.amazon.com/Translucent-Databases-Peter-Wayner/dp/...
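
To make the dependency concrete, a sketch (mine, with invented fields) of why matching forces the plaintext to stay readable, and what a universal ID would change:

    import hmac
    import hashlib

    # Without a shared unique ID, matching must inspect plaintext demographics:
    a = {"name": "Jon Smith",  "dob": "1/7/70"}
    b = {"name": "John Smith", "dob": "1970-07-01"}
    # No hash of these values will ever collide, so a heuristic (plus a human)
    # has to decide whether they're the same patient -- hence the plaintext.

    # With a universal ID, matching becomes a keyed-hash equality test,
    # and the demographics can stay encrypted:
    KEY = b"exchange-wide matching key"   # invented for illustration

    def match_key(uid: str) -> str:
        return hmac.new(KEY, uid.encode(), hashlib.sha256).hexdigest()

    assert match_key("US-12345") == match_key("US-12345")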

About once a year, I go to a "future of healthcare IT" event. I desperately want to hear that patient privacy is being addressed. Hope springs eternal. Mostly, no one knows what I'm talking about. Until they've worked on the systems and tried to actually implement privacy safeguards, people just don't grok the problem domain, and they continue to believe it's a trivially solvable problem.

johnm · 2008-06-01 · Original thread
In practical terms, just use HTTPS with CA-rooted certs. It's not expensive for basic usage. And if you're just doing a for-fun project, you can always use self-signed certs.
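
For instance, in Python with the requests library (the URLs and cert path are placeholders): a CA-rooted cert verifies by default, while a self-signed cert just needs to be pinned explicitly.

    import requests

    # CA-rooted cert: TLS verification happens out of the box.
    requests.get("https://example.com")

    # Self-signed cert on a for-fun project: pin your own cert instead.
    requests.get("https://myproject.example", verify="my-selfsigned-cert.pem")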

If you want to go deeper down the rabbit hole check out SRP: http://srp.stanford.edu/

In terms of dealing more securely with data on your server, check out the book, Translucent Databases ( http://www.amazon.com/Translucent-Databases-Peter-Wayner/dp/... )
