Identity Solutions for the Internet: Part 1

Felix Hildebrandt
Published in KEEZdao
Feb 14, 2022


Many thanks to Rob G, Hugo Masclet, and Callum Grindle, who assisted me in polishing this article. I am also very grateful for the collaboration with KEEZ to bring more attention to the topic.

The mapping of identities, digital assets, and online profiles has recently gained significant traction in the blockchain industry. New technology is forming architectures that will further pave the way for decentralized and user-centered mechanisms. The article will discuss the following:

  1. How the internet has evolved in managing identities
  2. What problems arose for identity management
  3. How new architectures help build applications on top of privacy rights

To determine the privacy, accounting, and user-data guidelines needed for the upcoming branch of decentralized services, we consider current technological and ethical perspectives and lessons learned from past identity systems.

Identity within the typical internet

The internet is a global mesh of servers that use protocols such as TCP and IP to transfer data between specific device addresses. Individuals can upload data to servers, and regular private computers can connect and load that data. When the World Wide Web appeared around 1990, web pages could be described as windows into a new world. By today's standards, those web pages were primitive. They lacked user management, were read-only, and were mainly used to share university knowledge. At this time, communication was still mostly done via phone or mail, but email use began to grow. Society quickly embraced the internet and, as the interaction between users and the internet grew, a new era of online connectivity emerged, known today as Web2. It was a front-end revolution with new browser functionalities, leaving server-centered structures and databases (the backend) unchanged.

As the internet matured, the demand for web analytics increased. Due to the technological limitations of the time, administrators could only determine how many devices visited certain pages and at what time they viewed the content. Extended tracking of interactions was impossible, which led to new technology that gave services access to information about the individual in front of the device.
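To make this limitation concrete, here is a minimal sketch of device-level analytics. The log entries, addresses, and function names are hypothetical and only illustrate the idea: the administrator sees addresses, pages, and timestamps, but nothing about the person behind the device.

```python
from collections import defaultdict

# Hypothetical access-log entries: (device address, page, time of request).
LOG = [
    ("203.0.113.5", "/index.html", "1995-03-01T10:00:00"),
    ("203.0.113.5", "/papers.html", "1995-03-01T10:02:10"),
    ("198.51.100.7", "/index.html", "1995-03-01T11:15:42"),
    ("203.0.113.5", "/index.html", "1995-03-02T09:30:00"),
]


def devices_per_page(log):
    """Map each page to the set of distinct device addresses that requested it."""
    visitors = defaultdict(set)
    for address, page, _timestamp in log:
        visitors[page].add(address)
    return dict(visitors)


def visit_times(log, page):
    """List the request times recorded for one page, in log order."""
    return [timestamp for _address, p, timestamp in log if p == page]


if __name__ == "__main__":
    for page, addresses in sorted(devices_per_page(LOG).items()):
        print(page, len(addresses), visit_times(LOG, page))
```

Two people sharing one computer, or one person using two computers, cannot be told apart with this data alone, which is the gap later tracking technology set out to close.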

Investment in IT security and backup systems increased drastically to manage the throughput and safety of what became the most valuable good: user data. As fraud increased, companies built large server centers to secure user files from unauthorized access. Engineers developed cookies and APIs to track user behavior within sessions. Services could even store traffic data or user information within the browser. Behavior analysis became so popular that it evolved into a standalone business. Collecting data about users' cart contents, interests, browsing history, and previously viewed advertisements was essential to improving sales. With that information, companies gained significant insight into the person behind the device. New use cases like social media, e-commerce, and even interactive knowledge platforms proliferated. For example, Facebook and Google encouraged a digital social life, knowledge platforms like Wikipedia grew rapidly in content, and Amazon claimed the retail market. A vast need for detailed user data emerged. What started as optimizing profits by tracking users evolved into directly gaining profit from personal information. In summary, data analysis has become an essential factor in how digital products gain value.
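The cookie mechanism mentioned above can be sketched in a few lines. This is a simplified illustration using Python's standard `http.cookies` module; the cookie name `session_id`, the `views` store, and the helper functions are assumptions made for this example, not a description of any particular service.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical server-side store: session identifier -> pages viewed.
views = {}


def issue_session_cookie():
    """Build a Set-Cookie header value carrying a random session identifier."""
    cookie = SimpleCookie()
    cookie["session_id"] = secrets.token_hex(16)
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()


def read_session_id(cookie_header):
    """Extract the session identifier from a Cookie request header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel else None


def record_view(cookie_header, page):
    """Attach a page view to the session that sent the cookie."""
    session_id = read_session_id(cookie_header)
    views.setdefault(session_id, []).append(page)


if __name__ == "__main__":
    set_cookie = issue_session_cookie()        # sent by the server once
    browser_cookie = set_cookie.split(";")[0]  # the browser returns name=value
    record_view(browser_cookie, "/cart")
    record_view(browser_cookie, "/checkout")
    print(views)
```

Because the browser sends the same identifier with every request, the server can link separate page views into one behavioral profile, even though it still only sees a device and not a verified person.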

Currently, identity on the web primarily consists of multiple user accounts created for almost every software product or service being used. When a device logs in, access is granted to the information contained within the account. In this relationship, the service provider is the account's custodian and completely controls all its data. Users can log in directly with the service provider or by linking to an existing login from another provider. The second method, which allows logins to multiple services using one main account, was developed by IT giants with billions of users, including Google, Facebook, and Microsoft. This method increases convenience, but it also carries the risk of losing access to every linked account at once if the password is lost, the main account is compromised, or the provider's service is down.

Regular Web2 Login Scheme

New methods for securing authentication have evolved, with 2FA and OAuth 2.0 being two current standards. 2FA adds a layer of security to the traditional username and password method by requiring users to present additional proof of authenticity, such as a code from an authenticator app. OAuth 2.0 lets a service obtain limited access to an account held at another provider without ever seeing the user's password, and it gives users control over where their data is shared. However, the man-in-the-middle principle remains: the intermediate provider can always surveil users' interactions with the accounts linked to its services. Linked logins pose a serious danger to privacy and are susceptible to attacks that can affect all connected services at once.
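As an illustration of the authenticator-app codes mentioned above, the following is a minimal sketch of time-based one-time passwords in the style of RFC 6238, using only the Python standard library. The function names and the demo secret are chosen for this example; a real deployment would rely on an audited library and also handle clock drift and rate limiting.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password (HMAC-SHA1 with dynamic truncation)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # the low 4 bits of the last byte pick a window
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now) // step, digits)


def verify(secret: bytes, submitted: str, at=None) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at), submitted)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # demo value only
    print(totp(shared_secret))
```

The server and the authenticator app share the secret once during setup. Afterwards, both derive the same short-lived code from the current time, so a stolen password alone is not enough to log in.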

Fundamentally, the challenges of creating digital identity solutions can be traced back to the internet's architecture. It was designed around machines with unique device addresses, not individuals with unique identities. There are no built-in systems to verify identity, only systems that can identify devices. Initially, the internet was a read-only source of information. When it became interactive in the Web2 era, developers created username and password authentication methods. These methods are built on top of the original device-based architecture, which is prone to data manipulation and interception. The lack of a sophisticated identity layer within the internet is one of the primary sources of cybercrime and identity theft. This global threat causes enormous financial and personal damage. Ginni Rometty, then chairman, president, and CEO of IBM, described cybercrime as "the greatest threat to every profession, every industry, every company in the world."

Web2 lacks the technology for creating unique digital values that we can sign. In the real world, the signature on a document represents your identity, but the files we transfer over the internet are just copies. Commonly, we scan documents and reuse those scans as credentials across the internet, but providing a screenshot of a passport or certificate of enrollment leaves a lot of room for fraud. With so many services holding various pieces of personal data, it is easy to lose track of who holds or uses the data, or whether it is even up to date. In this situation, users must place immense trust in the service providers that hold their data and identity information.

Personal data is stored on servers operated by companies. Even with regulations that give users control over their data, the information technically remains in the companies' hands. Users are gaining the right to manage the data being collected about them, but that does not stop companies from processing data that was already collected. How quickly companies can analyze data for the desired advantage is simply a matter of computational power, which keeps growing rapidly.

Implementation of Data and Security Laws

Ethical questions have also been raised concerning the collection of personal data, which led to the General Data Protection Regulation (GDPR) of the European Union, applicable since 2018. Under the GDPR, "everything that helps identify a person, regardless of whether it refers to a person's professional, private, or public life" counts as personal data.

The ethical classification is based on this definition and should ensure that users have the exclusive right to manage their data, whether sensitive information or not. User data is collected in any case, and it is tough to limit. Governments should not restrict the data collection, but citizens should have full and transparent access to the collected personal data.

The General Data Protection Regulation protects data collected from the citizens of the European Union. Data sovereignty must be a fundamental right that all companies guarantee. It should apply to all citizens, institutions, and businesses within the European Union. The goal of the GDPR is to protect the fundamental rights and freedoms of natural persons with regard to the processing and free movement of their data. To protect citizens, companies need to clearly define which personal data is stored and which processing methods are used.

Companies often have enough user data to form exact personality models. The GDPR limits how this data may be sourced, but companies can creatively manipulate data to exploit legal loopholes. Also, innovation in data processing allows more and more information to be extracted from legally sourced data, which reduces the effectiveness of regulations.

In the future, companies will have to adapt to new regulations continuously. Over time, users will gain more rights to erase data and to look up where and when companies stored it. Higher fines, and the obligation to notify users in the event of infringements, will follow. Regulations must be extraterritorial, meaning they must also apply to companies that operate servers outside the EU. The essential consideration is where the data comes from and where it flows. Governments have also discussed certificates that verify that certain services and products fulfill the GDPR standards. Still, verification may be an obstacle, as the EU would have to perform periodic evaluations of the code and algorithms used within digital services. Implementing systems to oversee compliance with regulations is an enormous task and could result in a lengthy restructuring of digital ecosystems. Every business, healthcare system, government entity, e-commerce platform, and future IoT device requires identity management. Rethinking and changing how data is stored and managed, from small companies to IT giants, is highly relevant.

As defined within the GDPR, companies must comply with the data protection rights listed in its checklist. The identity infrastructure needed to do so is expensive. Many companies are still caught up in data ownership lawsuits, ambiguous data sales, and the gray areas of user behavior prediction. As users' rights to manage personal data continue to grow, the demand for compliance will continue to increase. Users already have the right to prevent the collection of specific data and to force its deletion. The GDPR's definition of user data makes clear that the data collected is owned by the users and that those users can allow access if they wish to do so. Despite the existing regulation, not all companies fully adhere to the established rules, and some find it nearly impossible because the rules are so cumbersome to navigate. If companies provide more transparency, users will have more grounds to object to how their personal data and user profiles are used. Transparency also discloses the sources and origin of the data. These aspects can limit the quality of Big Data processes if users deny the gathering of specific data streams. However, it is a step in the right direction towards fair interaction and compliance with human rights for as long as we remain in a centralized system. Companies will still hold the user data and control identities, but users will get more management rights. These improvements can simultaneously strengthen customer loyalty and the significance of data analysis.

Now that current account management and user data laws have been discussed, we can consider new identity management approaches in the second part.

Identity Solutions for the Internet: Part 2

Felix Hildebrandt is a Web3 Software Engineer at LUKSO, focusing on dApps, nodes, and community.