First draft

aewens 2018-08-05 01:55:11 -04:00
parent 5e0734e5a7
commit 62ed114a1b
1 changed files with 375 additions and 1 deletions

README.md

@ -1,2 +1,376 @@
# paper
**Table of Contents**
[TOC]
*This is a work-in-progress, please check back later for the final draft*
# Preface
< Placeholder >
# Overview
With the rise of technology, the world around us has been constantly changing.
However, one thing has remained predominantly the same throughout all these
years: how we prove we are who we say we are over the wire. If you have ever
created an account to utilize an online service, you had to provide a username
and/or email along with a password. Maybe that service already had the username
you wanted, the password you chose did not meet the seemingly arbitrary rules
they put forth, or your email or username was already registered and you
realized you had forgotten your password. If you have ever disliked any part
of this process, you are not alone: this manifesto was written for you and for
others who feel the same way and want something better.
# The State Of User Authentication
For every online service that allows for individuals to create accounts to
become users in their system, there must be a system in place to differentiate
one user from another (identification) and to prevent users from logging on as
others (authentication). This is typically handled by using a username for
identification and a password for authentication. In most cases, usernames and
passwords are decided by the user and they must abide by the parameters decided
by the service (typically to prevent duplicate usernames and simple passwords).
Many websites will also include an email address, either as additional account
information or in place of the username. This is typically done to mitigate
floods of bot accounts, to communicate with the user during password resets,
and/or to send the user additional information over time.
For authentication to take place, the user must supply the identification
credentials (username) together with the authentication credentials (password)
when logging in, so the service can verify that they match the pair it has
stored. In the event that the user cannot log in because they forgot or lost
their authentication credentials, the account can be recovered either by
following a prompt sent to the user's email after they request a reset or by
answering security questions. Security questions are questions the user
answered previously, where the answers should be known only to the user in
order to prevent identity theft. Once the user is authenticated, a cookie or
token is typically assigned to the user so that they can continue to navigate
the user-only pages of the service during their session without having to
authenticate again on every request.
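
The flow described above can be sketched in a few lines. This is only a minimal illustration of the conventional model, not any particular service's implementation: the in-memory `USERS` and `SESSIONS` stores and the function names are hypothetical, and a real service would use a database and expire its sessions.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stores standing in for a service's database.
USERS = {}     # username -> (salt, password hash)
SESSIONS = {}  # session token -> username


def _hash_password(password, salt):
    # PBKDF2 from the standard library; the iteration count is an assumption.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(username, password):
    """Identification: usernames must be unique within the service."""
    if username in USERS:
        return False
    salt = secrets.token_bytes(16)
    USERS[username] = (salt, _hash_password(password, salt))
    return True


def login(username, password):
    """Authentication: return a session token if the pair matches, else None."""
    record = USERS.get(username)
    if record is None:
        return None
    salt, stored = record
    if not hmac.compare_digest(stored, _hash_password(password, salt)):
        return None
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = username
    return token


def whoami(token):
    """Later requests present the token instead of the password."""
    return SESSIONS.get(token)
```

Note that even in this careful version the password itself still travels from the user to the service on every login.
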
# The Problem
Online services fill a number of roles: entertainment, a place to network and
socialize, a place to purchase goods and/or services, and a resource to manage
our assets. In short, it is important that these services be secure and
prohibit anyone aside from the owner of an account from gaining access to it.
However, the current status quo of how online accounts are set up contains
many systemic points of failure.
## Passwords
The intent behind a password is to allow the owner of the account to gain
access while preventing everyone else from getting in. This places the password
on the front line of defense against anyone who would try to break into an
account, so it should be the strongest part of the system. However, passwords
are created by users, and most users do not know the security measures that
should be taken when creating and managing them. This leads to situations where
many users choose the same easy-to-guess password[^1] or use passwords that
current systems can crack by brute force (this includes 8 character,
mixed-case, alphanumeric + symbols passwords).
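
To put a rough number on the brute-force claim, the arithmetic can be sketched as follows. The guess rate of ten billion guesses per second is purely an assumption for illustration (real rates depend heavily on the hash function and the attacker's hardware), and the helper names are hypothetical.

```python
import string

# Mixed-case letters, digits, and ASCII symbols: 52 + 10 + 32 = 94 characters.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def keyspace(length, alphabet_size=len(ALPHABET)):
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


def days_to_exhaust(length, guesses_per_second):
    """Worst-case time to try every password of the given length, in days."""
    return keyspace(length) / guesses_per_second / 86_400


# Assumed attacker speed (hypothetical): 10 billion guesses per second.
ASSUMED_RATE = 10_000_000_000

if __name__ == "__main__":
    print(f"8 characters:  {keyspace(8):.2e} candidates, "
          f"{days_to_exhaust(8, ASSUMED_RATE):.1f} days")
    print(f"12 characters: {keyspace(12):.2e} candidates, "
          f"{days_to_exhaust(12, ASSUMED_RATE):.2e} days")
```

Under that assumed rate, the full 8 character space falls in about a week, while every additional character multiplies the work by 94.
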
The trouble here is that the complexity needed for a sufficiently strong
password makes it hard to remember. This leads to the second big problem with
passwords: re-using the same password across multiple services. Nobody wants
to remember complicated passwords, and nobody wants to remember many different
ones either. Out of laziness, the need to remember that one good password, or
having previously been locked out of an account after forgetting which password
went with it, a large group of users have decided to use the same password on
different services. This greatly expands the attack surface: one compromised
account can lead to every other account sharing the same credentials becoming
compromised as well.
## Email
Just as sharing one password across different services is a problem, so is
using the same email everywhere. The purpose email is supposed to serve is
sending messages electronically to another individual, in a fashion similar to
snail mail. However, online services have created a second purpose for email:
a means to verify who a user is without the need for a password. For this
reason, if a password needs to be reset, the de facto way of resetting it is
to send a reset link to that user's email so that only they can change the
password on their service.
The inherent problem here is that email providers use the same
username/password scheme to authenticate their own users. So if a user has an
easy to crack or guessable email password, an attacker can access every online
service tied to that email. Just as many individuals have only one mailing
address, most users have only one email. Because of the nature of how email
works, an attacker who breaches a user's email would not even need to put much
effort into discovering which online services the user is signed up for: the
service probably sent an email upon sign-up that is still in the inbox, where
the attacker can search for it.
While the first problem with email is that it creates a single point of failure
across all of the user's online services, there are other downfalls with email
in the current system of user authentication. If the user makes the password
for their email the same password they use on all their other services, an
entirely new problem opens up. Now, not only does that one email allow an
attacker to compromise all of the user's services, but a compromise of any one
service would directly allow the user's email to be compromised as well. In
this situation, which is by no means uncommon for the average user, every
online service is a single point of failure that can cause a ripple effect
leading to all their other services becoming compromised as well.
## Service Providers
While users cannot be relied on to know the proper security measures to take,
surely the developers and engineers running the online services we use would be
able to help keep users' accounts from becoming compromised? Unfortunately,
the barrier to entry for creating a login system for an online service does
not include being taught the proper techniques for handling and storing a
user's login credentials. Every year there are multiple reports of services
(even the big names that you'd expect to know better) that have their systems
breached, only to reveal that the passwords were stored as plain text[^2],
leaving users' account credentials immediately compromised. The alternative is
to hash[^3] the passwords so attackers do not instantly have their hands on
them. However, if the passwords are simply hashed without any other
precautionary steps (such as a unique salt per password), an attacker can still
use a rainbow table[^4] to look up the passwords from their hashes without
brute-forcing each one.
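
The difference between a bare hash and a salted one can be shown with Python's standard library. This is a minimal sketch, not a recommendation for a specific configuration: the function names are hypothetical and the PBKDF2 iteration count is an assumption.

```python
import hashlib
import hmac
import secrets


def unsalted_hash(password):
    """A bare hash: identical passwords always produce identical digests,
    which is exactly what makes precomputed (rainbow) tables effective."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password, salt=None):
    """A salted, deliberately slow hash. Returns (salt, digest)."""
    if salt is None:
        salt = secrets.token_bytes(16)  # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify(password, salt, digest):
    """Recompute with the stored salt and compare in constant time."""
    _, candidate = salted_hash(password, salt)
    return hmac.compare_digest(candidate, digest)
```

Two users who both pick `hunter2` get the same `unsalted_hash`, so cracking one cracks both, whereas their `salted_hash` results differ and a precomputed table would have to be rebuilt for every salt.
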
If the above didn't all make sense, that's okay. The main takeaway here is
that users should not trust the service providers to adequately handle their
account information in the event of any security breach on their end.
## User Adoption
The previous three problems are but a few in a long list of systemic issues
with the current model of user authentication. While the initial thought may be
"just create a new system", it's not that easy. Part of the reason all services
use the same email/username/password model is because that's what users are
used to and expect when they go to sign up for an online service. There have
been attempts at solving the above problems by creating password
managers[^5][^6][^7] to create and remember your passwords; using two-factor
authentication, aka 2FA, through software[^8] or hardware[^9]; and using other
sites to handle user authentication[^10]. However, these solutions are not the
defaults, require too much extra effort from users, and/or are completely
unknown to users. For something to solve the issue of user authentication at a
meaningful scale, it would have to solve the problems above (and more),
require as much or less effort from the user, be intuitive and easy to grasp
for users ranging from children to elders, and be adoptable by service
providers as the new default or as an alternative to the current model.
# Zccount
After thinking over these issues with user authentication, along with many
others that I came to realize later on, I wondered if there was a way to fix
them. However, before explaining what Zccount is, I feel it is important to
discuss what it isn't and what it doesn't try to solve.
## What It Is Not
User authentication is utilized in many different mediums, but I feel it's
important to make a clear distinction about which use cases Zccount is not
meant to apply to. Put simply, that is all instances of user authentication
for accessing hardware. While the username / password scheme does prevent most
users from accessing your device, it is much like the front door of one's
house: it keeps the hoi polloi from crossing the threshold of your home, but if
anyone truly desires to get in there is little to stop them. The simplest
example is an unencrypted drive, which can be mounted and read from another
machine without ever loading the OS. There are ways to protect against this
(as mentioned, encryption is a good place to start), but this is both beyond
the scope of this paper and outside the field of user authentication that
Zccount tries to address.
As well, Zccount does not and will not make any attempt to strengthen the
security of one's service. While it tries to minimize the impact of one's
service being breached, it alone will not prohibit nor assist in the prevention
of the breach of services; Zccount strictly tries to further the default
security granted to the users.
## Core Principles
Keeping in mind what Zccount is not, I'll go over what Zccount is (or will try
to be) by first going over its philosophies and the rules it must adhere to.
### Trustless
One of the core beliefs of Zccount, and what I believe to be the greatest
issue with user authentication, is that one's password is sent "over the wire"
to the server to be used for authentication. This both allows
man-in-the-middle[^11] (MITM) attacks to occur upon each authentication and
requires trusting the service that stores your password to keep it safe
(whether the risk is the service intentionally handing it off to a third party
or the service being compromised through a security breach). For this reason,
it is essential that passwords never leave the client.
The key to allowing authentication without exposing one's password is
zero-knowledge proofs[^12] (ZKPs), a method that allows one party to prove to
another that they have a password without exposing it, by means of statistics,
probability, and other mathematical processes (for an example of such
implementations, there's password-authenticated key agreement[^14]). The manner
in which ZKPs will be utilized in Zccount will be explained later on.
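
To make the idea of proving knowledge of a secret without sending it more concrete, here is a toy sketch of the Schnorr identification protocol, a textbook interactive proof of knowledge. It is not the mechanism this paper proposes for Zccount, and the group parameters (`P`, `G`), the password-to-secret derivation, and all function names are assumptions chosen for brevity, not for real-world security.

```python
import hashlib
import secrets

# Toy parameters (assumption): a Mersenne prime modulus and a small base.
# A real deployment would use a vetted group, not these values.
P = 2**127 - 1
G = 3
N = P - 1  # exponents can be reduced mod P - 1 (Fermat's little theorem)


def secret_from_password(password, salt):
    """Hypothetical derivation of the private exponent from a password."""
    raw = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return int.from_bytes(raw, "big") % (N - 1) + 1


def public_key(x):
    """The value the server stores instead of the password: y = G^x mod P."""
    return pow(G, x, P)


def commit():
    """Prover picks a random nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(N - 1) + 1
    return r, pow(G, r, P)


def challenge():
    """Verifier replies with a random challenge c."""
    return secrets.randbelow(2**64)


def respond(x, r, c):
    """Prover answers with s = r + c*x; x itself is never transmitted."""
    return (r + c * x) % N


def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P
```

In one round the client sends `t`, receives `c`, and answers with `s`; the server only ever holds `y`, from which recovering `x` requires solving a discrete logarithm.
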
### Diversity
As I mentioned above, one failure in the username / password scheme is that if
one service is compromised, your password can then be used on other accounts
sharing the same password. To prevent this type of attack, Zccount assigns a
different set of credentials to each service so that if any one service is
compromised, the damage stays isolated and does not put the other services at
risk. This includes both the identifier for the user and the verifier for the
user.
This also provides an added layer of security: through the credentials alone,
services cannot track a user between services (which is not the case with
email).
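
A minimal sketch of per-service credentials might look like the following. The `CredentialStore` class, its field names, and the credential sizes are hypothetical; the only point is that each service gets its own independently random identifier and verifier.

```python
import secrets


class CredentialStore:
    """Hypothetical client-side store: one random credential pair per service."""

    def __init__(self):
        self._by_service = {}

    def credentials_for(self, service):
        """Create the pair on first use, then return the same pair afterwards."""
        if service not in self._by_service:
            self._by_service[service] = {
                "identifier": secrets.token_hex(16),  # stands in for a username
                "verifier": secrets.token_hex(32),    # stands in for a password
            }
        return self._by_service[service]
```

Because the values are drawn independently, nothing in the pair issued for one service reveals or matches the pair issued for another.
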
### Consent
Another way the current methods of user authentication fall short for their
users is that by default anyone can use someone's credentials from any device
(including one that does not belong to the user). At face value this may look
like a feature rather than an issue, since it lets users access their services
from any device, but it should require some action by the user beforehand. I
will grant that some services get this right by either informing users when
their account has been used on a new machine or, even better, not allowing the
account to be accessed on a new machine until authorized by the user (however,
this is unfortunately at times done via email, which as stated above is far
from ideal). The way Zccount currently proposes to handle this is to derive a
reproducible unique identifier for each machine or, if that is not possible,
to generate one to serve the same purpose (and, in keeping with the philosophy
of diversity, this machine identifier will be altered for each service). For
security reasons, the user should be able to request a listing of the devices
associated with a service so that any device they suspect of being compromised
can be revoked, and the machine identifier itself should be able to be
regenerated.
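
One way the machine identifier and device listing could work is sketched below. Everything here is an assumption made for illustration: using the hardware address from `uuid.getnode()` as the base value, mixing it with a local secret through HMAC to get a per-service value, and the `DeviceRegistry` class standing in for the service's side.

```python
import hashlib
import hmac
import uuid


def base_machine_id():
    """A reproducible per-machine value; uuid.getnode() falls back to a
    random number when no hardware address can be read."""
    return uuid.getnode().to_bytes(6, "big")


def machine_id_for(service, base, local_secret):
    """Alter the machine identifier for each service. Replacing local_secret
    regenerates every identifier derived from it."""
    message = base + service.encode("utf-8")
    return hmac.new(local_secret, message, hashlib.sha256).hexdigest()


class DeviceRegistry:
    """Hypothetical service-side list of machines allowed on one account."""

    def __init__(self):
        self._devices = set()

    def authorize(self, machine_id):
        self._devices.add(machine_id)

    def list_devices(self):
        return sorted(self._devices)

    def revoke(self, machine_id):
        self._devices.discard(machine_id)

    def is_authorized(self, machine_id):
        return machine_id in self._devices
```

The same machine yields the same identifier for a given service every time, while two services see unrelated values for it.
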
The method Zccount proposes for how a user consents to a login on a new device
is a synchronizing process: a device already logged in to the account
(presumably a mobile device in most cases) begins generating time-based
one-time passwords[^15] (TOTPs), which must be entered on the other device
correctly X times in a row (where X will either be standardized or determined
by the vendor) to prove that the user is in fact the one logging into the
account. This also provides added security should their credentials become
compromised, since it would keep the attacker from logging in on a new device.
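
The synchronizing step can be sketched with a standard TOTP implementation (HMAC-SHA1 with 30 second steps, as in RFC 6238). The `confirm_new_device` helper, and the simplification that the codes are entered in consecutive time steps starting at a known time, are assumptions of this sketch rather than part of the proposal.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """Time-based one-time password for the time step containing unix_time."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def confirm_new_device(secret, entered_codes, start_time, required, step=30):
    """Accept the new device only if `required` consecutive codes, one per
    time step starting at start_time, were all entered correctly."""
    if len(entered_codes) < required:
        return False
    for i in range(required):
        expected = totp(secret, start_time + i * step, step)
        if not hmac.compare_digest(expected, entered_codes[i]):
            return False
    return True
```

Requiring several codes in a row means a single lucky guess (one in a million for six digits) is not enough to pass.
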
### Accessibility
I am not referring to Zccount's accessibility to the disabled here (which is
definitely important, but due to its core principle of being **Automatic** it
already is), but rather how accessible the Zccount system is to everyone
currently using the username / password scheme. Ideally, Zccount should be
usable by anyone who is currently able to use a username and password to log
in to their accounts. While this may appear obvious, many of the recent systems
that try to tackle the user authentication problem rely heavily on either
biometrics or audio / visual inputs (e.g. cameras and microphones). These
technologies work great for 2FA, but because only recent devices have these
features, others like older laptops and a plurality of desktops would be left
unable to use Zccount without adding the appropriate hardware (it would also
needlessly intrude on the privacy of users by requiring that they provide
their face, voice, fingerprints, etc. to identify themselves to a service).
For this reason, any implementation or proposal for Zccount should assume the
least amount of hardware possible so that the largest audience of users is
reachable.
### Automatic
This principle is needed not so much from a security standpoint as from the
perspective of user experience and adaptability, which is crucial to Zccount
succeeding in the "real world". The method Zccount proposes to allow for
automated user authentication is to generate the identifier and verifier
(which are analogous to the username and password in the old model). When an
account is made with Zccount, the triad of user identifier, verifier, and
machine identifier is generated, and this is done again for each service. The
way Zccount proposes to solve the issue of authenticating with the server is
to send its user and machine identifiers to the server over a secure
connection (i.e. HTTPS and not HTTP) and request that the server generate a
verifier on its end for the account along with an artificial neural
network[^16] (ANN).
Taking a brief tangent away from the authentication process, I want to explain
why ANNs are used here. While ANNs are not something normally associated with
user authentication, taken outside of the context of machine learning they are
function approximators: given a set of inputs and expected outputs, training
produces an approximate function that maps the former to the latter. A problem
with ANNs in the machine learning space is that when trained repeatedly on the
same set of inputs, they suffer from overfitting[^17], where the network
supplies the correct output for the inputs used for training but does not work
as expected for other inputs (I won't explain why here since it is outside the
scope of this paper, so just take my word for it).
However, in this case we're going to abuse the properties of overfitting to
suit our needs. We are going to take the ANN provided by the server (which can
and should be a different / random architecture for each user) and use our
local verifier as the input to train the ANN to output the server's verifier.
Once trained, the client will store the architecture and weights of the ANN
locally and send them to the server. To maximize security, multiple different
ANNs can be sent to the user to train against (and even better, with a
different verifier for each ANN), so the server can randomly pick a network
for the user to test against for authentication. This way the server never
sees the user's verifier, yet can verify the authenticity of the user.
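
The training step can be sketched in plain Python. This is a deliberately simplified illustration under several assumptions that are not part of the proposal: verifiers are treated as bit vectors, the network has one hidden layer whose random weights stand in for the server-supplied architecture and stay frozen, and only the output layer is trained. It shows the memorization step only; how the network behaves on other inputs, and how the server should check a response, are not specified here.

```python
import math
import random


def to_bits(data):
    """Turn a bytes verifier into a list of 0/1 values."""
    return [(byte >> i) & 1 for byte in data for i in range(8)]


def make_network(n_in, n_hidden, n_out, seed):
    """Server side (hypothetical): random hidden layer, untrained output layer."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "w2": [[0.0] * n_hidden for _ in range(n_out)],
        "b2": [0.0] * n_out,
    }


def hidden(net, x):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in net["w1"]]


def forward(net, x):
    h = hidden(net, x)
    return [
        1.0 / (1.0 + math.exp(-(sum(w * hk for w, hk in zip(row, h)) + b)))
        for row, b in zip(net["w2"], net["b2"])
    ]


def predict_bits(net, x):
    return [1 if out > 0.5 else 0 for out in forward(net, x)]


def overfit(net, x, target, lr=0.5, steps=200):
    """Client side: gradient descent on a single example until it is memorized."""
    h = hidden(net, x)  # constant, because the hidden layer is frozen
    for _ in range(steps):
        out = forward(net, x)
        for j, (o, t) in enumerate(zip(out, target)):
            err = o - t  # cross-entropy gradient at the output unit
            net["b2"][j] -= lr * err
            for k, hk in enumerate(h):
                net["w2"][j][k] -= lr * err * hk
    return net
```

After `overfit`, feeding the local verifier's bits through the network reproduces the server verifier's bits, and the resulting `net` dictionary is what the client would keep and upload.
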
## Desired Features
While Zccount should not deviate from its core principles, there are
non-essential features it ought to have to fully replace the niceties that
users have come to expect from the current user authentication model. As of
right now, the core principles of Zccount allow for a user experience in which
simply visiting a service automatically creates an account that is bound to
the user's machine, uses a verifier that is near-impossible to brute-force,
and uses an identifier that cannot be used to track the user across different
sites, all without any action from the user. The only thing the service would
need now for the account is application-specific information about the user.
While this is definitely an improvement over the default security a plurality
of users begin with, there are some features that may be desired.
### Recoverability
In the current email / username / password scheme a user can reset the
password for their account when any combination of the three is lost or
forgotten by either sending an email, supplying personal information only the
user would know that was collected prior (e.g. phone number, social security
number, etc.), or answering security questions. Each of these approaches has
its own faults (most of which are described above). Zccount can cope with this
in a situation where the user has logged into at least one other device prior
and has access to it (via re-syncing the account as described in **Consent**),
but there's no good and secure way to do this when only one device is
associated with the user's account. This can be remedied by requesting that
the server generate recovery keys (similar to the backup codes provided
through 2FA) ahead of time, which the user later supplies to prove
authenticity (as a side note, these keys should be able to be revoked and
regenerated at any point if it is suspected they have been compromised). While
in keeping with the "more security by default" philosophy it may seem odd that
recoverability is not provided by default, it is in fact more secure to have
recoverability as an opt-in feature rather than an opt-out one. While
recoverability is convenient, it opens another security hole in the account
that could be exploited, and not including it renders attackers just as
powerless to enter the account as the user if the credentials are lost.
However, losing credentials should be nowhere near as easy with Zccount
because they are generated automatically by Zccount with no action required
from the user, so as long as Zccount is working on the user's device the
credentials should be there as well.
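
Opt-in recovery keys along these lines could be sketched as follows. The `RecoveryKeys` class, the number of keys, and the choice to store only their hashes and make each key single-use are assumptions of this sketch, modeled on how 2FA backup codes commonly behave.

```python
import hashlib
import secrets


class RecoveryKeys:
    """Hypothetical server-side handling of opt-in recovery keys."""

    def __init__(self):
        self._hashes = set()  # empty set means recovery is not enabled

    @staticmethod
    def _digest(key):
        # Keys are long random strings, so a plain hash is enough here.
        return hashlib.sha256(key.encode("utf-8")).hexdigest()

    def generate(self, count=8):
        """Revoke any existing keys and return a fresh batch, shown once."""
        keys = [secrets.token_hex(8) for _ in range(count)]
        self._hashes = {self._digest(key) for key in keys}
        return keys

    def revoke_all(self):
        self._hashes.clear()

    def redeem(self, key):
        """Each key proves authenticity once, then stops working."""
        digest = self._digest(key)
        if digest in self._hashes:
            self._hashes.remove(digest)
            return True
        return False
```

Until `generate` is called there is nothing to redeem, which matches recoverability being opt-in.
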
# Design Flaws
While the current proposals for Zccount eliminate many of the issues users face
in the user authentication process, it is not without its own drawbacks and
flaws. This is hopefully where others can come in to suggest amendments to the
Zccount proposal to rectify these issues.
## Viruses / Malware
No device is ever completely safe from viruses and malware. The current model
is vulnerable to attacks where either of these attempts to read and/or corrupt
the values Zccount stores locally on the user's machine. The worst-case
scenarios here are that the attacker finds a way to reverse-engineer which
credentials match up to which service and uses them to hijack a user's
account, or that corrupted data locks a user out of their accounts (which can
be minimized by using the methods described in **Consent** and
**Recoverability**). While other measures should be taken to prevent this and
it is not directly within the scope of problems Zccount tries to solve, the
side effects are drastic enough that prevention measures should be considered.
# Conclusion
Users deserve a better user experience than the one provided to them by online
services. The current user authentication system adheres to a "blame the
consumer" ideology where if a user is not knowledgeable of the actions needed
to be taken on their part to ensure security, they are left unprotected and
defenseless against attackers while under the belief their assets are secure.
We have the available technology to provide a better solution, regardless of
whether that solution is Zccount or something else, so let's make a change and
provide users the security they deserve.
- - -
# References
[^1]: <https://en.wikipedia.org/wiki/List_of_the_most_common_passwords>
[^2]: <https://en.wikipedia.org/wiki/Plain_text>
[^3]: <https://en.wikipedia.org/wiki/Cryptographic_hash_function>
[^4]: <https://en.wikipedia.org/wiki/Rainbow_table>
[^5]: <https://bitwarden.com/>
[^6]: <https://1password.com/>
[^7]: <https://www.lastpass.com/>
[^8]: <https://authy.com/>
[^9]: <https://www.yubico.com/>
[^10]: <https://en.wikipedia.org/wiki/OAuth>
[^11]: <https://en.wikipedia.org/wiki/Man-in-the-middle_attack>
[^12]: <https://en.wikipedia.org/wiki/Zero-knowledge_proof>
[^14]: <https://en.wikipedia.org/wiki/Password-authenticated_key_agreement>
[^15]: <https://en.wikipedia.org/wiki/Time-based_One-time_Password_algorithm>
[^16]: <https://en.wikipedia.org/wiki/Artificial_neural_network>
[^17]: <https://en.wikipedia.org/wiki/Overfitting>