First draft

2018-08-05 01:55:11 -04:00 · 2018-08-05 01:55:11 -04:00 · 62ed114a1b
parent 5e0734e5a7
commit 62ed114a1b
1 changed files with 375 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -1,2 +1,376 @@
-# paper
+**Table of Contents**

+[TOC]
+
+*This is a work-in-progress, please check back later for the final draft*
+
+# Preface
+&lt; Placeholder &gt;
+
+# Overview
+With the rise of technology, the world around us has been constantly changing. 
+However, one thing has remained predominantly the same throughout all these 
+years: how we prove we are who we say we are over the wire. If you have ever 
+created an account to utilize an online service, you had to provide a username 
+and/or email along with a password. Maybe that service already had the username 
+you wanted, the password you chose did not meet the seemingly arbitrary rules 
+they put forth, or your email was already registered and you realized you had 
+forgotten your password. If you have ever been frustrated by any part of this 
+process, you are not alone: this manifesto was written for you and for others 
+who feel the same way and want something better.
+
+# The State Of User Authentication
+For every online service that allows for individuals to create accounts to 
+become users in their system, there must be a system in place to differentiate 
+one user from another (identification) and to prevent users from logging on as 
+others (authentication). This is typically handled by using a username for 
+identification and a password for authentication. In most cases, usernames and 
+passwords are decided by the user and they must abide by the parameters decided 
+by the service (typically to prevent duplicate usernames and simple passwords). 
+Many websites will also include an email address, either as additional account 
+information or in place of the username. This is typically done to mitigate 
+floods of bot accounts, to give the service a way to reach the user during 
+password resets, and/or to send the user additional information over time.
+
+To allow for authentication to take place, the identification credentials 
+(username) upon logging in must be supplied with the authentication credentials 
+(password) to verify if they match with the pair stored with the online 
+service. In the event that the user cannot log in due to forgetting/losing their 
+authentication credentials, the user's account can be recovered either through 
+following a prompt through the user's email after submitting a request for a 
+reset or by answering security questions. Security questions are supposed to be 
+questions answered previously where the answers should only be known by the 
+user to prevent identity theft. Once the user is authenticated, a cookie or 
+token is typically assigned to the user so that they can continue to navigate 
+through the user-only pages of the service during their session without having 
+to repeatedly re-authenticate themselves.
+
+# The Problem
+Online services fill a number of roles: entertainment, a place to network and 
+socialize, a place to purchase goods and/or services, and a resource for 
+managing our assets. It is therefore important that these services be secure 
+and prevent anyone other than the owner of an account from gaining access to 
+it. However, the way online accounts are currently set up contains many 
+systemic points of failure.
+
+## Passwords
+The intent behind a password is to allow the owner of the 
+account to gain access while preventing everyone else from getting in. This 
+places the password on the front line of defense against anyone who would try 
+to break into an account, so it should be the strongest part of the system. 
+However, passwords are created by users, many of whom do not know the proper 
+security measures for creating and managing them. This leads to situations 
+where many users choose the same easy-to-guess password[^1] or use passwords 
+that current hardware can crack by brute force (this includes 8-character, 
+mixed-case, alphanumeric + symbols passwords).
+
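To put a rough number on the brute-force claim above, here is a small back-of-the-envelope sketch. The guess rate is purely an assumption of mine for illustration (real rates depend heavily on the hash function and the attacker's hardware), and `keyspace` and `worst_case_hours` are hypothetical helper names.

```python
PRINTABLE_ASCII = 95   # upper + lower + digits + symbols + space
ASSUMED_RATE = 1e11    # hypothetical guesses per second, not a measured figure


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


def worst_case_hours(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Hours needed to try every candidate at the given guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second / 3600


if __name__ == "__main__":
    for length in (8, 12, 16):
        total = keyspace(PRINTABLE_ASCII, length)
        hours = worst_case_hours(PRINTABLE_ASCII, length, ASSUMED_RATE)
        print(f"{length} chars: {total:.2e} candidates, ~{hours:.3g} hours worst case")
```

Under that assumed rate, the full 8-character space is exhausted in well under a day, while each extra character multiplies the work by 95.
+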
+The trouble here is that the complexity needed for a sufficiently strong 
+password makes it hard to remember, which leads to the second big problem with 
+passwords: re-using them across multiple services. While nobody wants to 
+remember complicated passwords, they also don't want to remember multiple 
+different passwords. Out of laziness, the need to remember that one good 
+password, or having previously been locked out of an account after forgetting 
+which password went with it, a large group of users have decided to use the 
+same password on different services. This greatly widens the attack surface: 
+one compromised account can lead to every other account sharing the same 
+credentials becoming compromised as well.
+
+## Email
+Following the issue with passwords, where users share the same password 
+across different services, using the same email creates a similar issue. The 
+purpose of email is to send messages electronically to another individual in 
+a fashion similar to snail mail. However, online services have given email a 
+second purpose: verifying who a user is without the need for a password. For 
+this reason, if a password needs to be reset, the de facto way of resetting it 
+is to send a reset link to that user's email so that only they can change the 
+password on the service.
+
+The inherent problem here is that email providers use the same 
+username/password scheme to authenticate their users. So if a user has an 
+easy-to-crack or guessable email password, an attacker can access every 
+online service tied to that email. Just as many individuals have only one 
+mailing address, most users have only one email. And because of how email 
+works, an attacker who breaches a user's email doesn't need much effort to 
+discover which online services the user is signed up for: each service 
+probably sent an email upon sign-up that is still sitting in the inbox, where 
+the attacker can search for it.
+
+While the first problem with email is that it creates a single point of failure 
+across all of the user's online services, there are other downfalls with email 
+in the current system of user authentication. If the user makes the password 
+for their email the same password they use on all their other services, an 
+entirely new problem opens up. Not only does that one email allow an 
+attacker to compromise all of the user's services, but if any service becomes 
+compromised it directly allows the user's email to become compromised as 
+well. In this situation, which is by no means an uncommon situation for the 
+average user, every online service is a single point of failure that can cause 
+a rippling effect that leads to all their other services becoming compromised 
+as well.
+
+## Service Providers
+While users cannot be relied on to know the proper security measures to take, 
+surely the developers and engineers running the online services we use would be 
+able to help prevent users' accounts from becoming compromised? Unfortunately, 
+the barrier to entry for creating a login system for an online service does not 
+include being taught the proper techniques for handling and storing users' 
+login credentials. Every year there are multiple reports of services (even the 
+big names that you'd expect to know better) having their systems breached and 
+revealing that passwords were stored as plain text[^2], leaving users' 
+account credentials immediately compromised. The alternative is to store the 
+passwords hashed[^3] so attackers do not instantly have their hands on the 
+users' passwords. However, if the passwords are simply hashed without any 
+other precautionary steps (such as adding a unique salt to each password), an 
+attacker can still use a rainbow table[^4] to look up the passwords from their 
+hashes without needing to brute-force each one.
+
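As a rough illustration of why "simply hashed" is not enough, the sketch below contrasts an unsalted hash with a salted, deliberately slow one. The dictionary `LOOKUP` is only a simplified stand-in for a rainbow table (a real one uses hash chains to trade memory for time), and the function names and iteration count are my own assumptions rather than anything a particular service uses.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, chosen only for this example


def unsalted_hash(password: str) -> str:
    """Plain SHA-256: the same password always yields the same digest."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """PBKDF2-HMAC-SHA256 with a random per-user salt; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the salted digest and compare it in constant time."""
    _, digest = salted_hash(password, salt)
    return hmac.compare_digest(digest, expected)


# Simplified stand-in for a precomputed table: digest -> password.
LOOKUP = {unsalted_hash(p): p for p in ("123456", "password", "qwerty")}

if __name__ == "__main__":
    leaked = unsalted_hash("password")
    print("unsalted digest found in table:", LOOKUP.get(leaked))
    salt_a, digest_a = salted_hash("password")
    salt_b, digest_b = salted_hash("password")
    print("salted digests differ:", digest_a != digest_b)
    print("salted verify:", verify("password", salt_a, digest_a))
```

Because every user gets a different salt, one precomputed table no longer covers everyone, and the attacker is pushed back to guessing each password individually.
+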
+If the above didn't all make sense, that's okay. The main takeaway here is 
+that users should not trust the service providers to adequately handle their 
+account information in the event of any security breach on their end.
+
+## User Adoption
+The previous three problems are but a few in a long list of systemic issues 
+with the current model of user authentication. While the initial thought may be 
+"just create a new system", it's not that easy. Part of the reason all services 
+use the same email/username/password model is because that's what users are 
+used to and expect when they go to sign up for an online service. There have 
+been attempts at solving the above problems by creating password 
+managers[^5][^6][^7] to create and remember your passwords; using two-factor 
+authentication, aka 2FA, through software[^8] or hardware[^9]; and using other 
+sites to handle user authentication[^10]. However, these solutions are not 
+the defaults, require too much extra effort from users, and/or are 
+completely unknown to them. For something to solve the issue of user 
+authentication at a meaningful scale, it would have to solve the problems 
+above (and more), require as much or less effort from the user, be intuitive 
+and easy to grasp for everyone (from children to elders), and be adoptable by 
+service providers as the new default or as an alternative to the current 
+model.
+
+# Zccount
+After thinking over these issues with user authentication, along with many 
+others that I came to realize later on, I wondered if there was a way to fix 
+them. However, before explaining what Zccount is, I feel it is important to 
+discuss what it isn't and what it doesn't try to solve.
+
+## What It Is Not
+User authentication is utilized in many different mediums, but I feel it's 
+important to make a clear distinction about which use cases Zccount is not 
+meant for. Put simply, that means all instances of user authentication for 
+accessing hardware. While the username / password scheme does prevent most 
+users from accessing your device, it is much like the front door of a house: 
+it keeps the hoi polloi from crossing the threshold, but if anyone truly 
+desires to get in there's little to stop them. In the simplest case, if the 
+drive is not encrypted, it can be mounted and read from another machine 
+without ever loading the OS. There are ways to protect against this (as 
+mentioned, encryption is a good place to start), but this is both beyond the 
+scope of this paper and outside the area of user authentication that Zccount 
+tries to address.
+
+As well, Zccount does not and will not make any attempt to strengthen the 
+security of one's service. While it tries to minimize the impact of one's 
+service being breached, it alone will not prohibit nor assist in the prevention 
+of the breach of services; Zccount strictly tries to further the default 
+security granted to the users.
+
+## Core Principles
+Keeping in mind what Zccount is not, I'll go over what Zccount is (or will try 
+to be) by first going over its philosophies and the rules it must adhere to.
+
+### Trustless
+One of the core beliefs of Zccount, and what I believe to be the greatest 
+issue with user authentication, is that one's password is sent "over the 
+wire" to the server to be used for authentication. This both allows 
+man-in-the-middle[^11] (MITM) attacks to occur upon each authentication and 
+requires trusting the service that stores your password to keep it safe 
+(whether the risk is the service intentionally handing it off to a third 
+party or being compromised through a security breach). For this reason, it is 
+essential that passwords never leave the client.
+
+The key to allowing authentication without exposing one's password is 
+zero-knowledge proofs[^12] (ZKPs), a family of cryptographic protocols that 
+allow one party to prove to another that they know a secret (here, a password) 
+without revealing the secret itself, by means of probability and other 
+mathematical processes (for an example of such implementations, there's 
+password-authenticated key agreement[^14]). The manner in which ZKPs will be 
+utilized in Zccount will be explained later on.
+
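To make the zero-knowledge idea a little more concrete, here is a toy sketch of a Schnorr-style identification exchange. This is my own choice of example, not the mechanism Zccount proposes, and it is not a full PAKE. The group parameters (`P`, `G`) are toy values picked so the arithmetic is easy to follow, and deriving the secret from a password with a bare SHA-256 is an assumption made only to keep the example short; a real system would use a standardized group or elliptic curve and a vetted protocol.

```python
import hashlib
import secrets

P = 2**127 - 1   # a Mersenne prime; toy modulus, NOT a secure parameter choice
G = 3            # toy base
ORDER = P - 1    # exponents can be reduced mod P - 1 (Fermat's little theorem)


def secret_from_password(password: str) -> int:
    """Hypothetical derivation of the prover's secret exponent from a password."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ORDER


def public_key(x: int) -> int:
    """What the server stores instead of the password: y = G^x mod P."""
    return pow(G, x, P)


def commit():
    """Prover picks a random nonce r and sends the commitment t = G^r mod P."""
    r = secrets.randbelow(ORDER)
    return r, pow(G, r, P)


def challenge() -> int:
    """Verifier replies with a random challenge."""
    return secrets.randbelow(2**64)


def respond(x: int, r: int, c: int) -> int:
    """Prover answers with s = r + c*x, which hides x behind the nonce r."""
    return (r + c * x) % ORDER


def check(y: int, t: int, c: int, s: int) -> bool:
    """Verifier accepts if G^s == t * y^c (mod P); it never sees x."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


if __name__ == "__main__":
    x = secret_from_password("correct horse battery staple")
    y = public_key(x)          # registered with the server once
    r, t = commit()            # login, step 1: client -> server
    c = challenge()            # login, step 2: server -> client
    s = respond(x, r, c)       # login, step 3: client -> server
    print("server accepts:", check(y, t, c, s))
```

The server only ever holds `y`, `t`, `c`, and `s`; the password-derived secret `x` stays on the client throughout.
+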
+### Diversity
+As I mentioned above, one failure in the username / password scheme is that if 
+one service is compromised, your password can then be used on other accounts 
+sharing the same password. To prevent this type of attack, Zccount assigns a 
+different set of credentials to each service, so that if any one service is 
+compromised the damage stays isolated and does not put the other services at 
+risk. This includes both the identifier and the verifier for the user. It also 
+provides an added layer of privacy: through the credentials alone, services 
+cannot track a user across services (which is not the case with email).
+
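This paper does not pin down how the per-service credentials are produced, so the following is only one possible sketch: deriving them from a single locally stored secret with HMAC, keyed by the service's name. The master secret, the `purpose:service` labelling, and the function names are all assumptions of mine; generating fresh random values per service and storing them locally would satisfy the same principle.

```python
import hashlib
import hmac
import secrets


def derive(master: bytes, service: str, purpose: str) -> str:
    """Derive one hex credential bound to a (purpose, service) pair."""
    message = f"{purpose}:{service}".encode("utf-8")
    return hmac.new(master, message, hashlib.sha256).hexdigest()


def credentials_for(master: bytes, service: str) -> dict:
    """Return the identifier/verifier pair for a single service."""
    return {
        "identifier": derive(master, service, "identifier"),
        "verifier": derive(master, service, "verifier"),
    }


if __name__ == "__main__":
    master_secret = secrets.token_bytes(32)  # hypothetical secret kept on the device
    for service in ("example.com", "example.org"):
        creds = credentials_for(master_secret, service)
        print(service, creds["identifier"][:16], creds["verifier"][:16])
```

Two services see unrelated identifiers and verifiers, so a breach at one reveals nothing usable at the other and the identifiers cannot be joined to track the user.
+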
+### Consent
+Another way the current methods of user authentication fall short for their 
+users is that by default anyone can use someone's credentials from any device 
+(including one that does not belong to the user). At face value this may look 
+like a feature rather than an issue, since it allows users to access their 
+services from any device, but doing so should first require some action by the 
+user. I will grant that some services get this right by either informing users 
+when their account has been used on a new machine or, even better, not 
+allowing the account to be accessed from a new machine until authorized by 
+the user (however, this is unfortunately at times done via email, which as 
+stated above is far from ideal). The way Zccount currently proposes to handle 
+this is to derive a reproducible unique identifier for each machine or, if 
+that is not possible, to generate one to serve the same purpose (and, in 
+keeping with the philosophy of diversity, this machine identifier will be 
+different for each service). For security reasons, the user should be able to 
+request a listing of the devices associated with a service so that they can 
+revoke any device they suspect has been compromised, and should be able to 
+regenerate the machine identifier itself.
+
+The method Zccount proposes for how a user consents to a login on a new 
+device is a synchronizing process in which a device with the account already 
+logged in (presumably a mobile device in most cases) generates time-based 
+one-time passwords[^15] (TOTPs) that must be entered correctly on the new 
+device X times in a row (where X will either be standardized or determined 
+by the vendor) to prove that the user is in fact the one logging into the 
+account. This also provides added security should their account credentials 
+become compromised, since it would prevent the attacker from logging in on a 
+new device.
+
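The consent flow above can be sketched with the standard HOTP/TOTP construction (RFC 4226 and RFC 6238). The `consent_granted` helper, the shared `key`, the 30-second step, and the rule that each accepted code must come from a later time step than the previous one are my own assumptions about how "X times in a row" might be checked.

```python
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(key: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    return hotp(key, int(at) // step, digits)


def consent_granted(key: bytes, attempts, required: int, step: int = 30) -> bool:
    """Accept once `required` consecutive codes, each from a later step, are correct.

    `attempts` is an ordered list of (timestamp, code) pairs typed on the new device.
    """
    streak = 0
    last_counter = None
    for timestamp, code in attempts:
        counter = int(timestamp) // step
        is_new_step = last_counter is None or counter > last_counter
        if is_new_step and hmac.compare_digest(code, totp(key, timestamp, step)):
            streak += 1
        else:
            streak = 0  # any wrong or replayed code resets the run
        last_counter = counter
        if streak >= required:
            return True
    return False


if __name__ == "__main__":
    key = b"12345678901234567890"  # RFC test key, stands in for the synced secret
    good = [(t, totp(key, t)) for t in (0, 30, 60)]
    print("three in a row:", consent_granted(key, good, required=3))
```

Requiring several consecutive codes from distinct time steps makes a lucky guess or a single shoulder-surfed code insufficient to enroll a new device.
+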
+### Accessibility
+I am not referring to Zccount's accessibility to the disabled here (which is 
+definitely important, but is already covered by its core principle of being 
+**Automatic**), but rather to how accessible the Zccount system is to everyone 
+currently using the username / password scheme. Ideally, Zccount should be 
+usable by anyone who is currently able to use a username and password to log 
+in to their accounts. While this may appear obvious, many of the recent systems 
+that try to tackle the user authentication problem rely heavily on either 
+biometrics or audio / visual inputs (e.g. cameras and microphones). These 
+technologies work great for 2FA, but because only recent devices have these 
+features, others like older laptops and a plurality of desktops would be left 
+unable to use Zccount without adding the appropriate hardware (requiring them 
+also needlessly intrudes on the user's privacy by asking for their face, 
+voice, fingerprints, etc. to identify themselves to a service). For this 
+reason, any implementation or proposal for Zccount should assume the least 
+amount of hardware possible so that the largest possible audience of users is 
+reachable.
+
+### Automatic
+This principle is needed not so much from a security standpoint, but rather 
+from the perspective of user experience and adoptability, which is crucial to 
+the success of Zccount in the "real world". The method Zccount proposes for 
+automating user authentication is to generate the identifier and verifier 
+(which are analogous to the username and password in the old model). When an 
+account is made with Zccount, the triad of user identifier, verifier, and 
+machine identifier is generated, and this is done again for each service. To 
+authenticate with the server, Zccount proposes that the client send its user 
+and machine identifiers to the server over a secure connection (i.e. HTTPS and 
+not HTTP) and request that the server generate a verifier on its end for the 
+account along with an artificial neural network[^16] (ANN).
+
+Taking a brief tangent away from the authentication process, I want to explain 
+why ANNs are used here. While ANNs are not something usually associated with 
+user authentication, outside the context of machine learning they are simply 
+function approximators: given a set of inputs and expected outputs, training 
+produces a function that approximately maps one to the other. A problem with 
+ANNs in the machine learning space is that when trained repeatedly on the same 
+set of inputs, they suffer from overfitting[^17]: the network supplies the 
+correct output for the inputs used in training, but for all other inputs the 
+function doesn't work as expected (I won't explain why here since it is 
+outside the scope of this paper, so just take my word for it). However, in 
+this case we're going to abuse the properties of overfitting to suit our 
+needs. We are going to take the ANN provided by the server (which can and 
+should have a different / random architecture for each user) and use our 
+local verifier as the input to train the ANN to output the server's verifier. 
+Once trained, the client will store the architecture and weights of the ANN 
+locally and send them to the server. To maximize security, multiple different 
+ANNs can be sent to the user to train against (and even better, with a 
+different verifier for each ANN), so the server can randomly pick a network 
+for the user to be tested against during authentication. This way the server 
+never sees the user's verifier, yet can verify the authenticity of the user.
+
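To show the mechanics of the overfitting step, here is a deliberately tiny sketch: a single-layer network (the smallest thing one could call an ANN) trained on exactly one example so that the client's verifier maps to the server's verifier. The bit encoding, the layer shape, the learning rate, and every function name are assumptions of mine; the proposal itself leaves the architecture open, and this sketch makes no claim about how hard the mapping is to attack.

```python
import math
import random


def bits(value: bytes) -> list:
    """Unpack bytes into a flat list of 0/1 integers."""
    return [(byte >> i) & 1 for byte in value for i in range(8)]


def new_network(n_in: int, n_out: int, seed: int) -> dict:
    """Server side: a randomly initialised single-layer network."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return {"w": weights, "b": [0.0] * n_out}


def forward(net: dict, x: list) -> list:
    """Sigmoid activation of each output unit."""
    outputs = []
    for row, bias in zip(net["w"], net["b"]):
        z = sum(w * xi for w, xi in zip(row, x)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def overfit(net: dict, x: list, target: list, epochs: int = 100, lr: float = 0.5) -> dict:
    """Client side: gradient descent on ONE example until the net memorises it."""
    for _ in range(epochs):
        out = forward(net, x)
        for j, (o, t) in enumerate(zip(out, target)):
            err = o - t  # cross-entropy gradient w.r.t. the pre-activation
            net["b"][j] -= lr * err
            for i, xi in enumerate(x):
                net["w"][j][i] -= lr * err * xi
    return net


def answer(net: dict, x: list) -> list:
    """Round the network's outputs back to bits."""
    return [round(o) for o in forward(net, x)]


if __name__ == "__main__":
    client_verifier = bytes(range(8))         # stays on the client
    server_verifier = bytes([0xA5] * 8)       # generated by the server
    net = new_network(64, 64, seed=7)         # sent by the server
    overfit(net, bits(client_verifier), bits(server_verifier))
    print("matches:", answer(net, bits(client_verifier)) == bits(server_verifier))
```

After training, feeding the client's verifier through the network reproduces the server's verifier bit for bit, which is the property the authentication step relies on.
+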
+## Desired Features
+While Zccount should not deviate from its core principles, there are 
+non-essential features it ought to have to fully replace the niceties that 
+users have come to expect from the current user authentication model. As of 
+right now, the core principles of Zccount mean that simply visiting a service 
+automatically creates an account that is bound to the user's machine, uses a 
+verifier that is near-impossible to brute-force, and has an identifier that 
+cannot be used to track the user across different sites, all without any 
+action from the user. The only thing the service would still need for the 
+account is application-specific information about the user. While this is 
+definitely an improvement over the default security a plurality of users 
+begin with, there are some features that may be desired.
+
+### Recoverability
+In the current email / username / password scheme a user can reset the 
+password for their account when any combination of the three is lost or 
+forgotten by either sending an email, supplying personal information only 
+the user would know that was collected beforehand (e.g. phone number, social 
+security number, etc.), or answering security questions. Each of these 
+approaches has its own faults (most of which are described above). Zccount 
+can cope with this when the user has previously logged into at least one 
+other device and still has access to it (via re-syncing the account as 
+described in **Consent**), but there's no good and secure way to do this when 
+only one device is associated with the user's account. This can be remedied by 
+requesting in advance that the server generate recovery keys (similar to the 
+backup codes provided with 2FA) and supplying one to prove authenticity (as a 
+side note, these keys should be revocable and regenerable at any point if 
+it is suspected they have been compromised). While in keeping with the "more 
+security by default" philosophy it may seem odd that recoverability is not 
+provided by default, it is in fact more secure to have recoverability 
+as an opt-in feature rather than an opt-out one. While recoverability is 
+convenient, it opens another security hole in the account that could be 
+exploited, and not including it renders attackers just as powerless to enter 
+the account as the user if the credentials are lost. However, losing 
+credentials should be nowhere near as easy with Zccount, because they are 
+generated automatically with no action required from the user; as long as 
+Zccount is working on the user's device, the credentials should be there as 
+well.
+
+# Design Flaws
+While the current proposals for Zccount eliminate many of the issues users face 
+in the user authentication process, it is not without its own drawbacks and 
+flaws. This is hopefully where others can come in to suggest amendments to the 
+Zccount proposal to rectify these issues.
+
+## Viruses / Malware
+No device is ever completely safe from viruses and malware. The current model 
+is vulnerable to attacks where either of these attempts to read and/or 
+corrupt the values Zccount stores locally on the user's machine. The 
+worst-case scenarios here are that the attacker finds a way to work out which 
+credentials match up to which service and uses them to hijack a user's 
+account, or that corrupted data locks a user out of their accounts (the 
+impact of which can be minimized by using the methods described in 
+**Consent** and **Recoverability**). While preventing this is not directly 
+within the scope of the problems Zccount tries to solve and other measures 
+should be taken against it, the side effects are drastic enough that 
+prevention measures should be considered.
+
+# Conclusion
+Users deserve a better user experience than the one provided to them by online 
+services. The current user authentication system adheres to a "blame the 
+consumer" ideology where if a user is not knowledgeable of the actions needed 
+to be taken on their part to ensure security, they are left unprotected and 
+defenseless against attackers while under the belief their assets are secure. 
+We have the available technology to provide a better solution, regardless of 
+whether that solution is Zccount or something else, so let's make a change and 
+provide users the security they deserve.
+
+- - -
+
+# References
+[^1]: <https://en.wikipedia.org/wiki/List_of_the_most_common_passwords>
+[^2]: <https://en.wikipedia.org/wiki/Plain_text>
+[^3]: <https://en.wikipedia.org/wiki/Cryptographic_hash_function>
+[^4]: <https://en.wikipedia.org/wiki/Rainbow_table>
+[^5]: <https://bitwarden.com/>
+[^6]: <https://1password.com/>
+[^7]: <https://www.lastpass.com/>
+[^8]: <https://authy.com/>
+[^9]: <https://www.yubico.com/>
+[^10]: <https://en.wikipedia.org/wiki/OAuth>
+[^11]: <https://en.wikipedia.org/wiki/Man-in-the-middle_attack>
+[^12]: <https://en.wikipedia.org/wiki/Zero-knowledge_proof>
+[^14]: <https://en.wikipedia.org/wiki/Password-authenticated_key_agreement>
+[^15]: <https://en.wikipedia.org/wiki/Time-based_One-time_Password_algorithm>
+[^16]: <https://en.wikipedia.org/wiki/Artificial_neural_network>
+[^17]: <https://en.wikipedia.org/wiki/Overfitting>