April 05, 2023
Wednesday Morning Rant [Joe Mannix]
The Coin of the Realm
Late last week, Italy temporarily banned the use of OpenAI's ChatGPT. Why? Privacy and compliance with the EU's data-privacy regulation (the GDPR), and because the service has no way to verify users' ages and so can expose minors to regulated or controlled content. The common thread running through much of the commentary on this development is that Italy actually took this action because of the competitive threat AI poses, and that the privacy concerns are a smokescreen.
Yet it is not without legitimacy. On March 20, OpenAI had a data breach. Users received conversation information belonging to other users, and some personally identifiable information was exposed, including names and billing addresses. This raises definite red flags about data management and controls within the system, as well as about what is collected. How those data are used is also in question:
It said there was no legal basis to justify "the mass collection and storage of personal data for the purpose of 'training' the algorithms underlying the operation of the platform".
That's the real rub. AI requires data. Lots and lots of data. The more data available, the better the outcome (assuming the model works). But where to get those data? That's a problem, and it's a problem that Big Tech doesn't address. They can't address it, because to do so would challenge the technocratic, smartest-guys-in-the-room conceits of those organizations and individuals. Where to get the data? From everywhere, of course. On what authority, and by what right? Their own, of course.
High tech treats data with contempt and entitlement. The firms hoover it all up and don't bother to protect it (except, when possible, from each other). Almost every organization has internal policies and controls that make tremendous sense. These policies usually include concepts like the Principle of Least Privilege, under which anyone who does not need particular data should have no access to it. They also include rules around retention: that which is not strictly needed should not be retained (or collected in the first place). These are routinely ignored.
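For the non-IT reader, those two controls are simple enough to sketch in a few lines of code. This is a hypothetical illustration, not anyone's actual implementation; the roles, fields, and 90-day window are made up for the example.

```python
from datetime import datetime, timedelta

# Assumed retention window for the example; real policies vary.
RETENTION_LIMIT = timedelta(days=90)

# Hypothetical policy: each role lists only the data it actually needs.
ACCESS_POLICY = {
    "billing": {"name", "billing_address"},
    "support": {"name"},
}

def can_access(role: str, field: str) -> bool:
    """Principle of Least Privilege: deny unless the role needs the field."""
    return field in ACCESS_POLICY.get(role, set())

def should_purge(collected_at: datetime, now: datetime) -> bool:
    """Retention rule: data held past the limit should be deleted."""
    return now - collected_at > RETENTION_LIMIT

print(can_access("support", "billing_address"))  # False: support has no need for it
print(should_purge(datetime(2023, 1, 1), datetime(2023, 4, 5)))  # True: past 90 days
```

The point of both functions is that the default answer is "no" - access and retention have to be justified, not assumed. That default is exactly what the big data-hoarders invert.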
The law sometimes tries to tackle this problem. The EU's terribly-implemented and overly-complicated GDPR is one such attempt, and it dominates globally because it is the most restrictive and covers so many people. California has its own regulations, as do various states and countries. All of it ostensibly comes down to balancing data privacy against the interests of states and corporations. The individual, unsurprisingly, loses in these efforts.
Data sets - the more the better, about as many people and things as possible - keep the wheels turning. The tech firms feel entitled to them, and the state gets plenty of benefit by letting the firms run wild. Large-scale AI requires all of that data to function, and makes protection of it impossible. What if your data are needed for the AI to answer my question? What if your intellectual property is needed to provide my solution? If the AI is restricted in what it can access and use, its ability to answer questions is hampered. If it is unrestricted, its ability to damage privacy and rights is unencumbered. Privacy is antithetical to large-scale operations like those now entering the general-purpose market, but those operations are highly desirable to both government and industry.
Something tells me that the AI companies and the governments that regulate them will err on the side of "give it everything without restriction." The coin of the realm is data, after all, and money talks.
posted by Open Blogger at 11:00 AM