Explanation and demo of how to use GitLab CI together with Vault
Published on October 12, 2020 at 7:56 by Tamir Gefen
We have uploaded a new and useful video here, which includes explanations and a demo. The demo shows how to connect GitLab CI with HashiCorp Vault in order to expose secrets to GitLab runners.
The demo also shows how secrets in the production environment can be protected from misuse.
The video opens with a few minutes of background and an introduction to Vault and GitLab CI.
My name is Michael Krakowski. I work as a solutions architect for GitLab.
Today I'd like to tell you why and how you would want to integrate HashiCorp Vault with GitLab.
Let's start with understanding what HashiCorp Vault is from an engineering perspective.
Functionally, it is a credential database. It implements a principle similar to the hardware security modules once used in the industry, except that it does so in software. You have storage containing the database, and this storage can only be booted into the actual service by using n out of m available keys. Once it is up and running, it can be used to read, write, update, and delete secrets. As simple as that.
Interaction happens through an API, either directly or via the GUI or the command-line client (CLI). This command-line client is the same statically linked binary, written in Go, that is used to boot the server, so, as usual with Go binaries, it can run practically anywhere. Whenever you want to access the service, you first need to talk to one of the authentication services to obtain a token, which then authorizes all requests to the service.
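The "n out of m keys" requirement mentioned above is a threshold scheme; Vault's documentation describes its unseal keys in terms of Shamir's Secret Sharing. The following is a minimal sketch of that idea over a prime field, assuming a small integer secret. It is a toy illustration, not Vault's implementation, and the names `split_secret` and `recover_secret` are hypothetical.

```python
import secrets
from typing import List, Tuple

# Toy field modulus (assumption for this sketch): 2**127 - 1 is a Mersenne prime.
PRIME = 2**127 - 1


def split_secret(secret: int, threshold: int, shares: int) -> List[Tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must fit in the field")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: List[Tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to read its constant term."""
    secret = 0
    for i, (x_i, y_i) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (x_j, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -x_j) % PRIME
                denominator = (denominator * (x_i - x_j)) % PRIME
        # pow(d, -1, p) computes the modular inverse (Python 3.8+).
        secret = (secret + y_i * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


# Five key holders, any three of whom can reconstruct the secret together.
key_shares = split_secret(secret=123456789, threshold=3, shares=5)
print(recover_secret(key_shares[:3]))  # prints the original secret
```

With fewer than `threshold` shares the interpolation yields an unrelated value, which is why no single key holder can start the service alone.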
Where Vault really gets useful is its infrastructure-specific backends.
They are modules implementing this functionality for various types of infrastructure, and they can be used interchangeably.
For example, for AWS you can have a backend that defines a credential, but this credential is not a static one: it is dynamically generated and bound to a specific AWS role. Every time you request it, you get a different physical credential, and Vault manages it, meaning the credential has a limited time to live. It can be renewed or not, and it can be destroyed once it is no longer in use, which limits the credential's lifetime and the risk of it leaking.
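The lifecycle described above (a fresh credential per request, a limited time to live, optional renewal, and destruction) can be pictured with a small model. This is only a sketch under stated assumptions: `Lease` and `DynamicBackend` are hypothetical names, the random hex string stands in for a real cloud credential, and time is passed in explicitly so the behaviour is easy to follow.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Lease:
    access_key: str        # the "physical" credential handed to the caller
    expires_at: float      # absolute expiry time, in seconds
    revoked: bool = False

    def is_valid(self, now: float) -> bool:
        return not self.revoked and now < self.expires_at


class DynamicBackend:
    """Toy stand-in for an infrastructure-specific backend (hypothetical)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds

    def issue(self, now: float) -> Lease:
        # Every request generates a new credential instead of reusing a static one.
        return Lease(access_key=secrets.token_hex(8), expires_at=now + self.ttl_seconds)

    def renew(self, lease: Lease, now: float) -> None:
        if not lease.is_valid(now):
            raise ValueError("cannot renew an expired or revoked lease")
        lease.expires_at = now + self.ttl_seconds

    def revoke(self, lease: Lease) -> None:
        lease.revoked = True


backend = DynamicBackend(ttl_seconds=60)
first = backend.issue(now=0)
second = backend.issue(now=0)
print(first.access_key != second.access_key)  # two requests, two different credentials
```

Because validity is always checked against the expiry time and the revocation flag, a leaked credential stops being useful as soon as its lease runs out or is revoked.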
From the perspective of integration, what matters is that the Vault database is built as a tree-like structure: you have a configuration tree, and credentials live in the branches. You manage access to that database via policies, which grant specific operations on specific branches, for example the ability to write, read, or destroy a credential. Policies are gathered into roles, which are essentially groups of policies, and every authentication request to the authentication services to generate a token happens in the context of a given role. The token impersonates the role, and every request using this token is then authorized by Vault to either conduct the requested operations or not.
From the perspective of GitLab, the objective is of course to make those Vault-managed credentials available to the runner, so that you can have environment-specific credentials stored in the database. They are not managed within GitLab, but GitLab can still request them and use them as needed. The whole authentication concept is based on the JSON Web Token (JWT), which is one of the authentication mechanisms implemented by Vault and is used on the GitLab end. A JSON Web Token is essentially a JSON document specifying who the GitLab instance is, who the user on whose behalf the execution happens is, and the details of the job requesting that specific access.
The JWT is signed by GitLab and verified by Vault in the process. Authorization, in turn, happens entirely on the Vault end.
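To make the "JSON document" point concrete: a JWT consists of three base64url-encoded segments (header, payload, signature) joined by dots, and the payload is plain JSON. The sketch below builds an example token and reads its claims back. The claim names follow those GitLab documents for its CI job token, but all values are made up, the signature is a placeholder, and `read_claims` is a hypothetical helper that deliberately does not verify anything.

```python
import base64
import json


def b64url_encode(raw: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding stripped.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # Restore the padding that was stripped during encoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def read_claims(token: str) -> dict:
    """Return the payload claims WITHOUT verifying the signature (inspection only)."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


# Made-up example values; only the shape matters here.
example_claims = {
    "iss": "gitlab.example.com",
    "user_login": "jane",
    "project_id": "22",
    "project_path": "mygroup/sample-project",
    "pipeline_id": "1212",
    "job_id": "1213",
    "ref": "feature-branch",
}
example_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps(example_claims).encode()),
    "signature-placeholder",  # a real token carries GitLab's signature here
])

print(read_claims(example_token)["ref"])
```

Reading claims this way is fine for debugging, but trust in the claims comes only from the signature check, which in this integration is Vault's job.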
By sending the request, GitLab requests a specific role from Vault, and the token in the context of this role is either granted or not, based on a match between the JSON Web Token describing the job and the limitations set in the role. These limitations are defined by so-called bound claims, which are pretty important because they can limit the scope of the integration to a particular project or even a particular branch within that project. We are going to see that in practice pretty soon.
Another part to understand is that those secrets need to be explicitly defined in the GitLab CI YAML, so the runner will not extract every possible secret that is available to the role. It will only extract the secrets that are explicitly requested and declared in the GitLab CI YAML.
When you look at the flow graphically, once again, it is GitLab that generates the JWT and sends it to the runner, which forwards that JWT to Vault. Vault then verifies it with GitLab to make sure that the signature is valid. If it is, Vault returns the token that was previously requested by the runner based on the definition.
If you look at the example JWT, you will see that it basically contains the identity of the job: which user is making the request, the pipeline ID, the job ID, which specific project is doing this, what the job name is, and the branch in whose context the job is running.
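The matching between a job's claims and a role's bound claims can be sketched in a few lines. This is a simplified model that assumes exact string equality on every bound claim; `claims_satisfy` is a hypothetical helper, and the project ID and branch values are invented for illustration.

```python
def claims_satisfy(claims: dict, bound_claims: dict) -> bool:
    """Return True only if every bound claim is present in the JWT with the same value."""
    for name, required in bound_claims.items():
        if claims.get(name) != required:
            return False
    return True


# Hypothetical role limits: one bound to a project, one to a project and a branch.
project_only = {"project_id": "22"}
project_and_branch = {"project_id": "22", "ref": "master"}

# Hypothetical job identities, as they would appear in the JWT payload.
feature_job = {"project_id": "22", "ref": "feature-branch", "user_login": "jane"}
master_job = {"project_id": "22", "ref": "master", "user_login": "jane"}
other_project_job = {"project_id": "99", "ref": "master", "user_login": "jane"}

print(claims_satisfy(feature_job, project_only))        # same project: allowed
print(claims_satisfy(feature_job, project_and_branch))  # wrong branch: refused
```

Claims that the role does not bind, such as `user_login` here, play no part in the decision, so the more claims a role binds, the narrower the set of jobs that can assume it.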
Let's see how this works in practice:
I have set up a sample project that accesses credentials in Vault. This project has two environments, staging and production. Credentials are stored in environment-specific branches protected by environment-specific policies, and we will protect the production credentials from hijacking from any other environment using bound claims.
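The idea of environment-specific policies guarding environment-specific branches can be modelled as a lookup from policy name to path to granted capabilities. The sketch below is an assumption-laden illustration: the policy names and paths are hypothetical, matching is by exact path only, and `read` and `list` are used because they are standard Vault capabilities.

```python
# Hypothetical policies: each grants capabilities on exactly one branch of the tree.
POLICIES = {
    "sample-project-staging": {
        "secret/sample-project/staging": {"read", "list"},
    },
    "sample-project-production": {
        "secret/sample-project/production": {"read", "list"},
    },
}


def is_allowed(policy_names: list, path: str, capability: str) -> bool:
    """Check whether any of the given policies grants `capability` on `path`."""
    for name in policy_names:
        granted = POLICIES.get(name, {}).get(path, set())
        if capability in granted:
            return True
    return False


# A token carrying only the staging policy can read staging, nothing else.
staging_token_policies = ["sample-project-staging"]
print(is_allowed(staging_token_policies, "secret/sample-project/staging", "read"))
print(is_allowed(staging_token_policies, "secret/sample-project/production", "read"))
```

Anything not explicitly granted is denied, so a token tied to the staging policy has no way to reach the production branch of the tree.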
So let's first review what we have defined in Vault. There are two credentials: the production secret contains a username and a password that are production-specific, and the staging secret contains the same type of credentials, but the password is obviously different. We have two policies, which are symmetrical: one relates to the production environment and the other to the staging environment. As you might remember, policies essentially grant a certain set of operations for a certain endpoint or branch, so here we are talking about the sample project staging secret, and we are granting the capability to read and list. Then we have two roles.
Those roles are production- and staging-specific, and they differ because they contain specific bound claims. When you look at the staging role, the important parameter is the bound claim: it says that this role can be assumed only in the context of a specific project ID, so any other GitLab project would not get access to that particular role. The policy granted to it is the sample project staging policy. When you look at the production role, it contains even tighter control, because it insists not only on this particular project but also that the branch in whose context the job runs is master.
Now let's switch back to our system. I have a very simple project that contains only a GitLab CI YAML file, and this YAML defines two jobs that retrieve the secrets with the Vault method and then expose them to the console by echoing them to the output. As you can see, the staging job runs on every branch except master, and the production job runs only on master. The way those jobs are assigned roles is defined by variables: when you expand this section, you can see a Vault authentication role variable, defined separately for production and staging, with the relevant values that configure the job to request the right role and policy and eventually retrieve the right credentials. When you look at the CI/CD runs, you can see a feature branch pass, which shows the staging-specific credentials, while for the production branch it lists the production-specific password.
Now let's assume a user is malicious and would like to hijack the production credentials. We know that in GitLab you cannot make changes in the master branch without a proper merge request, but let's assume our user wants to experiment on a feature branch, where they are allowed to do anything they want. Going to the feature branch, I have two jobs, and within that feature branch I will remove the limitation that makes the production job run only on the master branch. I now expect both jobs to run on my next CI run. Again, I can only commit this to the feature branch without additional permission. This is what I do, and I can see that the pipeline has already failed. Let's see why. The pipeline contains both jobs, as expected. It contains a successful staging job, which is expected because we are in the staging environment, but the production job failed. The reason is that Vault is returning a 400 code: it validates the claims and sees that the claim on the master branch is not satisfied, so this role cannot be granted. This is our way to fence off production credentials and defend them against malicious use.
With this integration, you can use Vault to store your environment-specific credentials externally to GitLab, manage them with HashiCorp Vault, and yet use them in the job as you need them, on the go.