Entries Tagged 'Swift'

Keystone Federated Swift – Multi-region cluster, multiple federation, access same account

Welcome to the final post in the series; it has been a long time coming. If requested, I’m happy to delve deeper into any of these topics, but here I’ll attempt to explain the situation, the best approach to take, and how I got a POC working, which I am calling the brittle method. It definitely isn’t the best approach, but as it was done solely on the Swift side, and as I am an OpenStack Swift dev, it was the quickest and easiest option for me when preparing for the presentation.

To first understand how we can build a federated environment where we have access to our account no matter where we go, we need to learn about how keystone authentication works from a Swift perspective. Then we can look at how we can solve the problem.

Swift’s Keystoneauth middleware

As mentioned in earlier posts, there isn’t any magic in the way Swift authentication works. Swift is an end-to-end storage solution, so authentication is handled via authentication middlewares. Further, a single Swift cluster can talk to multiple auth backends, which is where the `reseller_prefix` comes into play. This was the first approach I blogged about in this series.

 

There is nothing magical about how authentication works. keystoneauth has its own idiosyncrasies, but in general it simply decides whether a request should be allowed. That makes writing your own auth middleware simple, and maybe an easy way around the problem, i.e. write an auth middleware that authenticates directly against your existing company LDAP server or authentication system.

 

To set up keystone authentication, you use keystone’s authtoken middleware and place the Swift keystoneauth middleware directly after it in the pipeline, configuring each of them in the proxy configuration:

pipeline = ... authtoken keystoneauth ... proxy-server

The authtoken middleware

Generally every request to Swift will include a token, unless it’s using tempurl or container-sync, or it’s to a container that has global read enabled, but you get the point.

As the swift-proxy is a Python WSGI app, the request first hits the first middleware in the pipeline (leftmost) and works its way to the right. When it hits the authtoken middleware, the token in the request is sent to keystone to be validated.

The resulting metadata, i.e. the user, storage_url, groups, roles etc., is dumped into the request environment and then passed to the next middleware: the keystoneauth middleware.
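To make that hand-off concrete, here is a minimal sketch of two toy WSGI middlewares chained the way authtoken and keystoneauth are. This is not the real keystonemiddleware or Swift code: the class names, the fake token table and the exact environ keys are assumptions made up for illustration.

```python
# Toy stand-ins for authtoken and keystoneauth; NOT the real middlewares.
FAKE_TOKENS = {  # hypothetical token -> identity table standing in for keystone
    "tok123": {"project_id": "abcdefg12345678", "roles": "swiftoperator"},
}


class ToyAuthToken:
    """Validates the token and dumps identity metadata into the environ."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        identity = FAKE_TOKENS.get(environ.get("HTTP_X_AUTH_TOKEN"))
        if identity:
            # Assumed key names, used here only for the sketch.
            environ["HTTP_X_PROJECT_ID"] = identity["project_id"]
            environ["HTTP_X_ROLES"] = identity["roles"]
        return self.app(environ, start_response)


class ToyKeystoneAuth:
    """Reads what ToyAuthToken left behind and allows or denies."""

    def __init__(self, app, operator_role="swiftoperator"):
        self.app = app
        self.operator_role = operator_role

    def __call__(self, environ, start_response):
        roles = environ.get("HTTP_X_ROLES", "").split(",")
        if self.operator_role not in roles:
            start_response("401 Unauthorized", [("Content-Length", "0")])
            return [b""]
        return self.app(environ, start_response)


def proxy_server(environ, start_response):
    """Stand-in for the proxy-server app at the end of the pipeline."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# Equivalent of: pipeline = authtoken keystoneauth proxy-server
pipeline = ToyAuthToken(ToyKeystoneAuth(proxy_server))
```

The leftmost middleware wraps everything to its right, which is why the request reaches the token check first and the allow/deny decision second.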

The keystoneauth middleware

The keystoneauth middleware checks the request environment for the metadata dumped by the authtoken middleware and makes a decision based on that. Things like:

  • If the token was one for one of the reseller_admin roles, then they have access.
  • If the user isn’t a swift user of the account/project the request is for, is there an ACL that will allow it?
  • If the user has a role that identifies them as a swift user/operator of this Swift account then great.

 

When checking whether the user has access to the given account (Swift account), it needs to know which account the request is for. This is easily determined, as it’s defined by the path of the URL you’re hitting. The URL you send to the Swift proxy is what we call the storage URL, and is in the form of:

http(s)://<url of proxy or proxy vip>/v1/<account>/<container>/<object>

The container and object elements are optional, as it depends on what you’re trying to do in Swift. When the keystoneauth middleware is authenticating, it’ll check that the project_id (or tenant_id) metadata dumped by authtoken, when concatenated with the reseller_prefix, matches the account in the given storage_url. For example, let’s say the following metadata was dumped by authtoken:

{
"X_PROJECT_ID": "abcdefg12345678",
"X_ROLES": "swiftoperator",
...
}

And say the reseller_prefix for keystoneauth was AUTH_, and we make any member of the swiftoperator role (in keystone) a swift operator (a swift user on the account). Then keystoneauth would allow access if the account in the storage URL matched AUTH_abcdefg12345678.
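As a rough sketch of that check (the function names are hypothetical and the real keystoneauth does a lot more, such as ACLs and reseller admin roles), the comparison boils down to something like:

```python
def account_from_path(path):
    """Return the <account> element of /v1/<account>[/<container>[/<object>]]."""
    parts = path.lstrip("/").split("/")
    return parts[1] if len(parts) > 1 and parts[0] == "v1" else None


def is_account_operator(path, project_id, roles,
                        reseller_prefix="AUTH_",
                        operator_roles=("swiftoperator",)):
    """True when the account in the path is <reseller_prefix><project_id>
    and the token carries one of the operator roles."""
    expected = reseller_prefix + project_id
    has_role = any(role in operator_roles for role in roles)
    return account_from_path(path) == expected and has_role
```

With the metadata from the example, a request to /v1/AUTH_abcdefg12345678/mycontainer would pass, while the same token used against any other account would not.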

 

When you authenticate to keystone, the object storage endpoint will point not only to the Swift endpoint (the swift proxy or swift proxy load balancer), but will also include your account, based on your project_id. More on this soon.

 

Does that make sense? Simply put, to use keystoneauth in a multi-federated environment, we just need to make sure that no matter which keystone we end up using, asking for the swift endpoint always returns the same Swift account name.

And therein lies our problem: the keystone object storage endpoint and the metadata authtoken dumps both use the project_id/tenant_id. This isn’t something that is synced or can be passed via federation metadata.

NOTE: This also means that you’d need to use the same reseller_prefix on all keystones in every federated environment. Otherwise the accounts won’t match.

 

Keystone Endpoint and Federation Side

When you add an object storage endpoint in keystone, for swift, the url looks something like:

http://swiftproxy:8080/v1/AUTH_$(tenant_id)s

 

Notice the $(tenant_id)s at the end? This is a placeholder that keystone internally will replace with the tenant_id of the project you authenticated as. $(project_id)s can also be used and maps to the same thing. And this is our problem.

When setting up federation between keystones (assuming keystone-to-keystone federation) you generate a mapping. This mapping can include the project name, but not the project_id. These IDs are auto-generated and not deterministic by name, so creating the same project on different federated keystone servers will result in different project_ids. When a keystone service provider (SP) federates with a keystone identity provider (IdP), the mapping they share shows how the service provider should map federated users locally. This includes creating a shadow project if a project doesn’t already exist for the federated user to be part of.

Because there is no way to sync project_ids in the mapping, the SP will create the project, which will have a unique project_id. This means that once the federated user has authenticated, their Swift storage endpoint from keystone will be different. In essence, as far as Swift is concerned, they will have access, but to a completely different Swift account. Let’s use an example: say there is a project on the IdP called ProjectA.

           project_name        project_id
  IdP      ProjectA            75294565521b4d4e8dc7ce77a25fa14b
  SP       ProjectA            cb0d5805d72a4f2a89ff260b15629799

Here we have a ProjectA on both the IdP and the SP. The one on the SP would be considered a shadow project to map the federated user to. However, the project_ids are different, because they are uniquely generated when the project is created in each keystone environment. Taking the Object Storage endpoint in keystone from our earlier example, we get:

 

          Object Storage Endpoint
  IdP     http://swiftproxy:8080/v1/AUTH_75294565521b4d4e8dc7ce77a25fa14b
  SP      http://swiftproxy:8080/v1/AUTH_cb0d5805d72a4f2a89ff260b15629799

So when talking to Swift you’ll be accessing different accounts, AUTH_75294565521b4d4e8dc7ce77a25fa14b and AUTH_cb0d5805d72a4f2a89ff260b15629799 respectively. This means objects you write in one federated environment will be placed in a completely different account, so you won’t be able to access them from elsewhere.
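The substitution is easy to mimic. The sketch below is not keystone’s code, just plain Python string formatting standing in for it (the function name is made up), but it shows how the one endpoint template turns into two different storage URLs once each keystone plugs in its own project_id:

```python
ENDPOINT_TEMPLATE = "http://swiftproxy:8080/v1/AUTH_$(tenant_id)s"


def expand_endpoint(template, project_id):
    """Fill a keystone-style endpoint template with a project ID."""
    # Turn $(name)s into Python's %(name)s so % formatting can fill it in.
    python_template = template.replace("$(", "%(")
    return python_template % {"tenant_id": project_id,
                              "project_id": project_id}


# Same project name, different IDs on each keystone (values from the table).
idp_url = expand_endpoint(ENDPOINT_TEMPLATE, "75294565521b4d4e8dc7ce77a25fa14b")
sp_url = expand_endpoint(ENDPOINT_TEMPLATE, "cb0d5805d72a4f2a89ff260b15629799")
```

The two URLs share everything up to AUTH_ and differ only in the ID, which is exactly the part federation can’t keep in sync.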

 

Interesting ways to approach the problem

Like I stated earlier, the solution is simply to always return the same storage URL no matter which federated environment you authenticate to. But how?

  1. Make sure the same project_id/tenant_id is used for _every_ project with the same name, or at least the same name in the domains that the federation mapping maps to. This means direct DB hacking, so not a good solution; we should solve this in code, not make Ops go hack databases.
  2. Have a unique ID for projects/tenants that can be synced in the federation mapping, and also make this available in the keystone endpoint template mapping, so there is a consistent Swift account to use. Hey, we already have project_id, which meets all the criteria except mapping, so that would be easiest and best.
  3. Use something that _can_ be synced in a federation mapping, like domain and project name. Except these don’t map to endpoint template mappings. But with a bit of hacking that should be fine.

Of the above approaches, 2 would be the best. 3 is good, except that if you pick something mutable like the project name and ever change it, you’d then authenticate to a completely different swift account, meaning you’d have just lost access to all your old objects! You may find yourself with grumpy Swift Ops who now need to do a potentially large data migration, or you’d be forced to never change your project name.

Option 2, being unique, won’t change, though a project ID doesn’t make for a very memorable account name. Maybe you could offer people a more memorable immutable project property to use. But to keep the change simple, being able to simply sync the project_id should get us everything we need.

 

When I was playing with this, it was for a presentation, so I had a time limit, and a very strict one. Being a Swift developer and knowing the Swift code base, I hacked together a variant of option 3 that didn’t involve hacking keystone at all. Why? Because I needed a POC and didn’t want to spend most of my time figuring out the inner workings of Keystone when I could just do a few hacks to have a complete Swift-only version. And it worked, though I wouldn’t recommend it. Option 3 is very brittle.

 

The brittle method – Swift only side – Option 3b

Because I didn’t have time to hack keystone, I took a different approach. The basic idea was to let authtoken authenticate and then finish building the storage URL on the swift side using the metadata authtoken dumps into the WSGI request env, thereby slightly modifying the way keystoneauth authenticates.

Step 1 – Give the keystoneauth middleware the ability to complete the storage url

By default we assume the incoming request will point to a complete account, meaning the object storage endpoint in keystone will end with something like:

'<uri>/v1/AUTH_%(tenant_id)s'

So let’s enhance keystoneauth so that, if given only the reseller_prefix, it can complete the account. To do this I added a use_dynamic_reseller option.

If you enable use_dynamic_reseller then the keystoneauth middleware will pull the project_id from authtoken’s metadata dumped in the WSGI environment. This allows a simplified keystone endpoint in the form:

'<uri>/v1/AUTH_'

This shortcut makes configuration easier, but can only be reliably used when on your own account and providing a token. API elements like tempurl and public containers need the full account in the path.

This still uses project_id, so it doesn’t solve our problem, but it meant I could get rid of the $(tenant_id)s from the endpoints. Here is the commit in my github fork.

Step 2 – Extend the dynamic reseller to include completing storage url with names

Next, we extend the keystoneauth middleware a little bit more and give it another option, use_dynamic_reseller_name, to complete the account with either project_name or domain_name and project_name, but only if you’re using keystone authentication version 3.

If you are, and want to have an account based on the name of the project, then you can enable use_dynamic_reseller_name in conjunction with use_dynamic_reseller to do so. The form used for the account would be:

<reseller_prefix><project_domain_name>_<project_name>

So using our previous example with a reseller_prefix of AUTH_, a project_domain_name of Domain and a project name of ProjectA, this would generate the account:

AUTH_Domain_ProjectA

This patch is also in my github fork.
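The patches themselves aren’t reproduced here, but the account-completion logic described in steps 1 and 2 boils down to something like the following rough sketch. Only the two option names come from the description above; the function name, its arguments and the fallback behaviour are assumptions for illustration.

```python
def complete_account(path_account, reseller_prefix, project_id,
                     project_name=None, project_domain_name=None,
                     use_dynamic_reseller=False,
                     use_dynamic_reseller_name=False):
    """Complete a bare reseller_prefix into a full account name."""
    if not use_dynamic_reseller or path_account != reseller_prefix:
        # Already a full account (or the feature is off): leave it alone.
        return path_account
    if use_dynamic_reseller_name and project_name and project_domain_name:
        # <reseller_prefix><project_domain_name>_<project_name>
        return "%s%s_%s" % (reseller_prefix, project_domain_name, project_name)
    # Assumed fallback: complete with the project_id, as in step 1.
    return reseller_prefix + project_id
```

So a request to <uri>/v1/AUTH_ from a user in Domain/ProjectA would be treated as AUTH_Domain_ProjectA on every cluster, regardless of the local project_id.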

Does this work? Yes! But as I’ve already mentioned in the last section, this is _very_ brittle. It also makes it confusing to know when you need to provide only the reseller_prefix and when your full account name. It would be so much easier to just extend keystone to sync and create shadow projects with the same project_id. Then everything would just work without hacking.

Keystone Federated Swift – Final post coming

This is a quick post to say the final topology post is coming. It’s currently in draft form and I hope to post it soon. I just realised it’s been a while so thought I’d better give an update.

 

The last post goes into what auth does, what is happening in keystone, and what needs to happen to really make this topology work, and then talks about the brittle POC I created to have something to demo. I’ll be discussing other, better options/alternatives. But all this means it’s become much more detailed than I originally expected. I hope to get it up by mid next week.

 

Thanks for waiting.

Monasca + Swift: Sending all your Swift metrics Monasca’s way

Last week was SUSE Hackweek. A week every employee is given to go have fun hacking something or learning something they find interesting. It’s an awesome annual event that SUSE runs. It’s my second and I love it.

While being snowed in in Dublin at the Dublin PTG a while ago, I chatted with Johannes, a monasca dev and very intelligent teammate at SUSE, and heard that Monasca has a statsd endpoint as part of the monasca agent that you can fire stats at. As a Swift developer this interests me greatly. Every Swift daemon dumps a plethora of statsd metrics. So can I put the 2 together? Can I simply install monasca-agent on each storage and proxy node and then point the statsd endpoints for all swift services at the local agent?

 

I started the week attempting to do just that. Because I’m new to monasca, and didn’t want to attempt to set it up by hand, I just ran a devstack + SAIO environment.

The devstack was just a simple monasca + keystone + horizon configuration and the SAIO was a standard Swift All In One.

 

Next I installed the monasca-agent on the SAIO and then updated Swift to point at it. In Swift, each config supports statsd server endpoint configuration options:

 

# You can enable StatsD logging here:
# log_statsd_host =
# log_statsd_port = 8125
# log_statsd_default_sample_rate = 1.0
# log_statsd_sample_rate_factor = 1.0
# log_statsd_metric_prefix =
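For anyone who hasn’t looked at StatsD before, the wire format is tiny: a name, a value and a type sent as a UDP datagram. The sketch below is not Swift’s own StatsD client, just a minimal illustration of the kind of packet that lands on port 8125; the function names are made up and the prefix handling is an assumption.

```python
import socket


def format_statsd(name, value, metric_type, prefix="", sample_rate=1.0):
    """Build a StatsD datagram such as b'account-auditor.passes:1|c'."""
    full_name = "%s.%s" % (prefix, name) if prefix else name
    line = "%s:%s|%s" % (full_name, value, metric_type)
    if sample_rate < 1.0:
        # Tell the receiver the metric was sampled so it can scale it up.
        line += "|@%s" % sample_rate
    return line.encode("utf-8")


def send_statsd(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send, which is all StatsD needs."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
```

Because it’s UDP, nothing breaks on the Swift side if no agent is listening, which makes it cheap to experiment with.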

 

So pointing swift at it is easy. I then uploaded a few objects to swift and bingo, inside Monasca’s influxdb instance I could see the Swift measurements.

 

account-auditor.passes
account-auditor.timing
account-replicator.attempts
account-replicator.no_changes
account-replicator.successes
account-replicator.timing
account-server.GET.timing
account-server.HEAD.timing
account-server.PUT.timing
account-server.REPLICATE.timing
container-auditor.passes
container-auditor.timing
container-replicator.attempts
container-replicator.no_changes
container-replicator.successes
container-replicator.timing
container-server.GET.timing
container-server.PUT.timing
container-server.REPLICATE.timing
container-updater.no_changes
container-updater.successes
container-updater.timing
monasca.collection_time_sec
monasca.thread_count
object-auditor.timing
object-replicator.partition.update.count.sdb1
object-replicator.partition.update.count.sdb2
object-replicator.partition.update.count.sdb3
object-replicator.partition.update.count.sdb4
object-replicator.partition.update.timing
object-replicator.suffix.hashes
object-server.HEAD.timing
object-server.PUT.sdb1.timing
object-server.PUT.sdb2.timing
object-server.PUT.sdb3.timing
object-server.PUT.sdb4.timing
object-server.PUT.timing
object-server.REPLICATE.timing
object-updater.timing
proxy-server.account.GET.200.first-byte.timing
proxy-server.account.GET.200.timing
proxy-server.account.GET.200.xfer
proxy-server.object.HEAD.404.timing
proxy-server.object.HEAD.404.xfer
proxy-server.object.PUT.201.timing
proxy-server.object.PUT.201.xfer
proxy-server.object.policy.1.HEAD.404.timing
proxy-server.object.policy.1.HEAD.404.xfer
proxy-server.object.policy.1.PUT.201.timing
proxy-server.object.policy.1.PUT.201.xfer

 

NOTE: This isn’t the complete list, as measurements are only added when new metrics are fired, and the SAIO is a small healthy swift cluster, so there aren’t many 500 series errors etc. But it works!

 

And better yet I have access to them in grafana via the monasca datasource!

 

swift_recon check plugin

I thought that was easy, but Swift actually provides more metrics than just that. Swift has a reconnaissance API (recon) on all the WSGI storage servers (account, container and object servers), which you can hit either via REST or the swift-recon tool. So next I wondered how hard it would be to write a swift_recon check plugin for Monasca.

Some of the recon metrics you can get aren’t really grafana friendly. But some would be awesome to have in the same place and closer to horizon where ops are looking.

 

So I went and wrote one. Like I said I couldn’t get all the metrics, but I got most:

 

swift_recon.account.account_auditor_pass_completed
swift_recon.account.account_audits_failed
swift_recon.account.account_audits_passed
swift_recon.account.account_audits_since
swift_recon.account.attempted
swift_recon.account.failure
swift_recon.account.replication_last
swift_recon.account.replication_time
swift_recon.account.success
swift_recon.container.attempted
swift_recon.container.container_auditor_pass_completed
swift_recon.container.container_audits_failed
swift_recon.container.container_audits_passed
swift_recon.container.container_audits_since
swift_recon.container.container_updater_sweep
swift_recon.container.failure
swift_recon.container.replication_last
swift_recon.container.replication_time
swift_recon.container.success
swift_recon.disk_usage.mounted
swift_recon.object.async_pending
swift_recon.object.attempted
swift_recon.object.auditor.object_auditor_stats_ALL.audit_time
swift_recon.object.auditor.object_auditor_stats_ALL.bytes_processed
swift_recon.object.auditor.object_auditor_stats_ALL.errors
swift_recon.object.auditor.object_auditor_stats_ALL.passes
swift_recon.object.auditor.object_auditor_stats_ALL.quarantined
swift_recon.object.auditor.object_auditor_stats_ALL.start_time
swift_recon.object.auditor.object_auditor_stats_ZBF.audit_time
swift_recon.object.auditor.object_auditor_stats_ZBF.bytes_processed
swift_recon.object.auditor.object_auditor_stats_ZBF.errors
swift_recon.object.auditor.object_auditor_stats_ZBF.passes
swift_recon.object.auditor.object_auditor_stats_ZBF.quarantined
swift_recon.object.auditor.object_auditor_stats_ZBF.start_time
swift_recon.object.expirer.expired_last_pass
swift_recon.object.expirer.object_expiration_pass
swift_recon.object.failure
swift_recon.object.object_updater_sweep
swift_recon.object.replication_last
swift_recon.object.replication_time
swift_recon.object.success
swift_recon.quarantined
swift_recon.unmounted
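The dotted names above fall out naturally if you flatten the nested JSON that recon returns. The following is only a sketch of that idea, not the plugin itself: the function name is hypothetical, and dropping non-numeric values is an assumption about what a metrics pipeline will accept.

```python
def flatten_metrics(data, prefix="swift_recon"):
    """Flatten nested recon-style JSON into {dotted.name: number}."""
    metrics = {}
    for key, value in data.items():
        name = "%s.%s" % (prefix, key)
        if isinstance(value, dict):
            metrics.update(flatten_metrics(value, name))
        elif isinstance(value, bool):
            continue  # skip flags, only numbers are kept as measurements
        elif isinstance(value, (int, float)):
            metrics[name] = value
        # anything else (strings, lists, None) is dropped
    return metrics


# Made-up payload for illustration; real recon responses differ in shape.
sample = {
    "object": {
        "async_pending": 3,
        "auditor": {"object_auditor_stats_ALL": {"errors": 0, "passes": 12}},
    },
    "version": "2.18.0",
}
flat = flatten_metrics(sample)
```

Here flat ends up with three entries, one per number, and the version string is left out.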

 

Some of the metric names might need tidying up, but so far, so good. One of the really interesting things Swift Ops usually want to keep an eye on is when all the replicators have completed a cycle. Why? Well, one example is ring rebalancing: on a large and busy cluster you want to avoid too much data movement, so when adding new drives you raise their weights slowly. But you also want to make sure a complete replication cycle has finished before you rebalance again. So knowing when you pushed a new ring out and the timestamps of the last replication run tells you when it’s safe. These are coming through nicely:

 

 

Unfortunately there are some metrics I can’t quite get through. You can use recon to get md5s of the rings and configs on each node, but I found md5s can’t get pushed through. You can also ask recon what version of swift is installed on each node (nice in a large deployment and when upgrading), but the version number also had issues. Both of these are probably not insurmountable, but I still need to figure out how.

 

swift_handoffs check plugin

I’ve been involved in the Swift community for quite a while now, and I’d heard of another awesome metric one of the SwiftStack cores came up with to give an awesome visualisation of the Swift cluster. He even provided a gist to the community that others could use and adapt. I thought, why not make sure everyone could use it? Let’s add it as another check plugin to the monasca agent.

 

Everything in Swift is treated as an object, and an object has a number of devices in the cluster that are considered primary (they store that object). When a drive gets full or there is too much load on, say, an object PUT, a primary may be unavailable; to meet the durability contract another node will store the object (this node is called a handoff for that object). The handoff node will push the handoff object to the primary as soon as it can (when the drive is replaced, comes back online, etc).

Further, a ring in Swift is divided into logical segments called partitions, and it’s these partitions that devices are responsible for storing (or think of it as: a device has to store all objects that belong to a partition). When we rebalance the ring, either by adding or removing drives or changing weights, these partitions shift around the cluster, either to, say, drain a drive or to move to where there is more space. Swift is really good at keeping this movement to a minimum. So after a rebalance, nodes that used to be primaries for some partitions won’t be anymore. They’ll suddenly be handoffs, and the back-end consistency engine will move the partitions to their new home.

So what’s interesting to note there is, it all involves handoff partitions.

 

Turns out, just watching the number of primary partitions vs the number of handoffs on each storage node gives you a great health indicator. When should I do a rebalance? When the handoff numbers are down. If there seems to be a build-up of handoffs in a region, maybe write affinity is in use and the WAN links are saturated, or there is some network/disk/server issue on one of the nodes around there, etc.

Here are the metrics:

 

swift_handoffs.handoffs
swift_handoffs.primary
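The counting behind those two numbers is simple set arithmetic. This is a hedged sketch rather than the plugin or the original gist: it assumes you already have the partition numbers found on a device and the partitions the ring says that device is primary for, and the function name is made up.

```python
def count_partitions(on_disk_partitions, primary_partitions):
    """Split the partitions found on a device into primary and handoff counts.

    on_disk_partitions: partition numbers found in the device's directory
    primary_partitions: partitions the ring assigns to this device
    """
    on_disk = set(on_disk_partitions)
    primaries = on_disk & set(primary_partitions)
    handoffs = on_disk - primaries
    return {"swift_handoffs.primary": len(primaries),
            "swift_handoffs.handoffs": len(handoffs)}
```

For example, a device holding partitions 1, 2, 3, 7 and 9 while the ring makes it primary for 1 to 4 would report 3 primaries and 2 handoffs.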

 

And here is a simplified snapshot. This is my SAIO with 4 simulated nodes. This is watching the storage nodes as a whole, but you can break it down to the drive. There is a graph for each node and each Swift ring. The rise in handoffs (Object – Policy 0 SAIO[1-3]) is due to me turning off the consistency engine and then changing the weights back to a nicely weighted cluster:

See Object - Policy 0. SAIO0’s weight has increased, so the other nodes now have handoff partitions to give it. If I now went and turned the consistency engine back on, you’d see more primary partitions on SAIO0.

 

Where’s the code

UPDATE: I’ve now pushed up the checks to monasca. They can be found here:

  • https://review.openstack.org/#/c/583876/
  • https://review.openstack.org/#/c/585067/

Keystone Federated Swift – Separate Clusters + Container Sync

This is the third post in the series of Keystone Federated Swift. To bounce back to the start you can visit the first post.

Separate Clusters + Container Sync

The idea with this topology is to deploy each of your OpenStack federated clusters each with their own unique swift cluster and then use another swift feature, container sync, to push objects you create on one federated environment to another.

In this case the keystone servers are federated. A very similar topology could be a global Swift cluster where each proxy only talks to a single region’s keystone. That would mean a user visiting a different region would authenticate via federation and be able to use the swift cluster, but would use a different account name. In both cases container sync could be used to synchronise the objects, say from the federated account to the original account. This works because container sync can synchronise containers in separate clusters or in the same cluster.

 

Setting up container sync

Setting up container sync is pretty straightforward, and is also well documented. At a high level it goes like this. Firstly you need to set up a trust between the different clusters. This is achieved by creating a container-sync-realms.conf file. The online example is:

[realm1]
key = realm1key
key2 = realm1key2
cluster_clustername1 = https://host1/v1/
cluster_clustername2 = https://host2/v1/

[realm2]
key = realm2key
key2 = realm2key2
cluster_clustername3 = https://host3/v1/
cluster_clustername4 = https://host4/v1/

 

Each realm is a set of different trusts. You can have as many clusters in a realm as you want, and as you can see you can build up different realms. In our example we’d only need 1 realm, so let’s use some better names.

[MyRealm]
key = someawesomekey
key2 = anotherkey
cluster_blue = https://blueproxyvip/v1
cluster_green = https://greenproxyvip/v1

NOTE: there is nothing stopping you from only having 1 cluster defined as you can use container sync within a cluster, or adding more clusters to a single realm.

 

Now in our example both the green and blue clusters need to have the MyRealm realm defined in their /etc/swift/container-sync-realms.conf file. The 2 keys are there so you can do key rotation. These keys should be kept secret as these keys will be used to define trust between the clusters.

 

The next step is to make sure you have the container_sync middleware in your proxy pipeline. There are 2 parts to container sync, the backend daemon that periodically checks containers for new objects and sends changes to the other cluster, and the middleware that is used to authenticate requests sent by container sync daemons from other clusters. We tend to place the container_sync middleware before (to the left of) any authentication middleware.

 

The last step is to tell container sync which containers to keep in sync. This is all done via container metadata, which is controlled by the user. Let’s assume we have 2 accounts, AUTH_matt on the blue cluster and AUTH_federatedmatt on the green, and we want to sync a container called mycontainer. Note that the containers don’t have to be called the same. We’d start by making sure the 2 containers have the same container sync key, which is defined by the owner of the container; this isn’t one of the realm keys but works in a similar way. Then we tell 1 container to sync with the other.
NOTE: you can make the relationship go both ways.

 

Let’s use curl first:

$ curl -i -X POST -H 'X-Auth-Token: <token>' \
-H 'X-Container-Sync-Key: secret' \
'http://blueproxyvip/v1/AUTH_matt/mycontainer'

$ curl -i -X POST -H 'X-Auth-Token: <token>' \
-H 'X-Container-Sync-Key: secret' \
-H 'X-Container-Sync-To: //MyRealm/blue/AUTH_matt/mycontainer' \
'http://greenproxyvip/v1/AUTH_federatedmatt/mycontainer'

Or via the swift client, noting that you need to change identities to set each account.

# To the blue cluster for AUTH_matt
$ swift post -k 'secret' mycontainer

# To the green cluster for AUTH_federatedmatt
$ swift post \
-t '//MyRealm/blue/AUTH_matt/mycontainer' \
-k 'secret' mycontainer

In a federated environment, you’d just need to set a key on each of the containers you want to work on while you’re away (or all of them I guess). Then when you visit, you can just add the sync-to metadata when you create containers on the other side. Likewise, if you knew the name of your account on the other side, you could set a sync-to if you needed to work on something over there.

 

To authenticate, container sync generates and compares an HMAC on both sides, where the HMAC covers both the realm and container keys, the verb, object name etc.
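As a rough illustration of that kind of check, here is an HMAC sketch using only the standard library. The function names are hypothetical, and the exact set of fields, their order and the choice of SHA1 are assumptions for the example rather than a statement of what Swift puts on the wire.

```python
import hashlib
import hmac


def sync_signature(method, path, x_timestamp, nonce, realm_key, user_key):
    """HMAC-SHA1 over the request details, keyed with the realm key."""
    message = "\n".join([method, path, x_timestamp, nonce, user_key])
    return hmac.new(realm_key.encode("utf-8"), message.encode("utf-8"),
                    hashlib.sha1).hexdigest()


def verify(signature, *args):
    """Recompute the signature on the receiving side and compare safely."""
    return hmac.compare_digest(signature, sync_signature(*args))
```

The sending cluster computes the signature, the receiving cluster recomputes it from its own copy of the realm key and the container’s sync key, and the request is only accepted if the two match. Neither key ever travels with the request.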

 

The obvious next question is: great, but do I need to know the name of each cluster? Well yes, but you can simply find them by asking swift via the info call. This is done by hitting the /info swift endpoint with whatever tool you want. If you’re using the swift client, then it’s:

$ swift info
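If you’d rather script it, something like the following sketch would do. The function names are made up, and the shape of the container_sync section in the /info response (realms containing clusters) is an assumption here, so check it against what your own cluster returns.

```python
import json
from urllib.request import urlopen


def fetch_info(proxy_url):
    """GET <proxy>/info and decode the JSON body."""
    with urlopen(proxy_url.rstrip("/") + "/info") as resp:
        return json.load(resp)


def sync_prefixes(info):
    """List the //realm/cluster prefixes usable in X-Container-Sync-To."""
    # Assumed layout: info["container_sync"]["realms"][realm]["clusters"]
    realms = info.get("container_sync", {}).get("realms", {})
    return sorted("//%s/%s" % (realm, cluster)
                  for realm, conf in realms.items()
                  for cluster in conf.get("clusters", {}))
```

Appending /<account>/<container> to one of the returned prefixes gives you a value for the sync-to header.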

Pros and cons

Pros

The biggest pro for this approach is that you don’t have to do anything special. Whether you have 1 swift cluster or a bunch throughout your federated environments, all you need to do is set up a container sync trust between them and the users can sync between themselves.

 

Cons

There are a few I can think of off the top of my head:

  1. You need to manually set the metadata on each container. Which might be fine if it’s just you, but if you have an app or something it’s something else you need to think about.
  2. Container sync will move the data periodically, so you may not see it in the other container straight away.
  3. More storage is used. Whether it’s 1 cluster or many, the objects will exist in both accounts.

Conclusion

This is an interesting approach, but I think it would be much better to have access to the same set of objects everywhere I go and have it just work. I’ll talk about how to go about that in the next post, as well as 1 specific way I got working as a POC.

 

Container sync is pretty cool. SwiftStack have recently open sourced another tool, 1space, that can do something similar. 1space looks awesome but I haven’t had a chance to play with it yet, so I’ll add it to the list of Swift things I want to play with whenever I get a chance.

Keystone Federated Swift – False Federation

This is the second post in my series of posts on Swift in a Keystone federated environment, and the first post where I’ll walk through the first environment, the one I’m calling ‘False Federation’. For details on this series of posts, including the rationale, see my last introductory post.

 

False Federation

This first environment doesn’t actually use Keystone federation. Instead it uses an existing ability of Swift to have more than 1 authentication middleware in the proxy pipeline, which is why I’m calling this ‘False Federation’.

Swift Resellers and the reseller_prefix

Swift, in an OpenStack environment, talks to Keystone for identity management through Keystone’s authtoken and the Swift keystoneauth middlewares. However Keystone isn’t required. Swift was designed to be a complete standalone storage solution, in fact many Swift deployments use different (like swauth) and sometimes custom authentication middlewares. This way people can easily integrate Swift into their own environments.

If you’ve spent any time setting up authentication middlewares (like keystoneauth) in Swift, you’ve undoubtedly come across Swift’s reseller_prefix option, and maybe thought to yourself, why?

 

As I mentioned earlier, from the start Swift was designed to be an end-to-end standalone storage system. One of the features it has always supported is the idea of more than 1 authentication middleware in the pipeline. And if you have more than 1, then you need a way to distinguish which authentication middleware handles which account. This is what the reseller_prefix does. Swift matches the reseller_prefix at the front of the account name to the authentication middleware that is to handle it.

This is actually a really powerful feature. It means you could resell your storage solution to other parties to manage accounts, or connect up different parts of your organisation, if say for some reason you have more than 1 source you want to use as an authentication service.

Some authentication middlewares, like keystoneauth, can even cover more than 1 reseller_prefix. This is how service tokens tend to be deployed, so a service can have its own namespace of users’ accounts for isolation, keeping the data safe from accidental deletion.

And yes, it’s also possible to set an empty reseller_prefix.

 

Multiple Keystone middlewares

Having got the idea of reseller_prefixes out of the way, this is the first potential solution and the idea behind ‘False Federation’. If you have a large Swift cluster, you could place the required authentication middlewares for each separate OpenStack environment you want to connect it to.

 

NOTE: There are 2 middlewares needed to connect to a single Keystone instance: Keystone’s authtoken and then Swift’s keystoneauth. Other authentication middlewares, like swauth and many custom ones, are only 1 middleware, so a little less confusing.

 

Before I get into the configuration, and before you run off and give it a go, I should mention that the current upstream keystoneauth in Swift doesn’t support being placed multiple times in a pipeline. Why? Because of the way it places itself in the wsgi environment. But never fear, I have written a patch to correct this behaviour specifically for this set of experiments, and when I get a chance to clean it up and write some tests I’ll push it upstream. In the meantime you can grab hold of the patch here.

 

I’m not going into huge amounts of detail on how to connect to Keystone; the Swift documentation and installation guides do that well. And really you’re just duplicating exactly that, but for each Keystone endpoint you want to connect to. If you need detailed instructions, then let me know. They say an image is worth more than 1000 words, so here is how it’s done in 1 pretty diagram:

The rundown is:

  • Edit your proxy-server.conf on each node, and create ‘[filter:authtoken]’ and ‘[filter:keystoneauth]’ sections for each Keystone endpoint. Note that the names of the filters have to be different.
  • Each ‘[filter:authtoken]’ will point to an endpoint, and its corresponding ‘[filter:keystoneauth]’ will have a different reseller_prefix, which will need to match the Object Storage endpoint in that keystone server’s service catalog (see project documentation).
  • You then place these filters in the proxy pipeline. When placing a pair, the authtoken must come before its keystoneauth partner, and the pair’s keystoneauth must also appear before the next authtoken (like in the picture).
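The ordering rule in the last bullet is easy to get wrong, so here is a small Python sketch that checks a pipeline string against it. This is just a sanity-check helper I’d write for myself, not anything Swift ships; the function name `check_auth_pairs` and filter names like `authtoken1` are assumptions, so use whatever you named your filters.

```python
# Hypothetical helper: verify authtoken/keystoneauth pairs are ordered sanely.

def check_auth_pairs(pipeline, pairs):
    """Return True if every (authtoken, keystoneauth) pair is well ordered.

    Rule: each authtoken comes before its own keystoneauth, and no pair
    overlaps another pair (a pair must finish before the next one starts).
    """
    names = pipeline.split()
    spans = []
    for authtoken, keystoneauth in pairs:
        if authtoken not in names or keystoneauth not in names:
            return False
        start, end = names.index(authtoken), names.index(keystoneauth)
        if start >= end:
            # keystoneauth placed before its authtoken.
            return False
        spans.append((start, end))
    # Sort by position and make sure consecutive pairs do not interleave.
    spans.sort()
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        if next_start < prev_end:
            return False
    return True


pairs = [("authtoken1", "keystoneauth1"), ("authtoken2", "keystoneauth2")]
good = "catch_errors cache authtoken1 keystoneauth1 authtoken2 keystoneauth2 proxy-server"
bad = "catch_errors cache authtoken1 authtoken2 keystoneauth1 keystoneauth2 proxy-server"
print(check_auth_pairs(good, pairs), check_auth_pairs(bad, pairs))
```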

 

NOTE: I’ve left off a bunch of middleware options in the picture to keep it small and readable.

 

Now if I send the following GET requests:
GET /v1/KEY_matt/pictures/cat.png
GET /v1/AUTH_matt/pictures/cat.png

 

The first would be authenticated on the blue keystone (or via ‘authtoken1 keystoneauth1’) and the second with the green keystone (or via ‘authtoken2 keystoneauth2’).

 

Cons

This approach was to demonstrate what Swift could already do. But there are some limitations, which as always depend on your situation. Keystone’s authtoken middleware will always go and try to authenticate, so every authtoken in the pipeline adds latency to each request going through the proxy. If the keystones are close, maybe that’s ok. But if this was a geographical cluster with keystones all around the world then… ouch. If you were using a custom middleware, you’d just skip the reseller_prefixes that don’t relate to you (like keystoneauth does).
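As a rough illustration of that last point, here is a toy WSGI middleware in Python that only does any auth work for accounts under its own reseller prefix and passes everything else straight through. It is a sketch under my own assumptions, not a real Swift middleware: the class name `PrefixAuth`, the KEY prefix and the hard-coded "valid-token" check are all made up.

```python
# Toy WSGI auth middleware (hypothetical), skipping accounts it doesn't own.

class PrefixAuth(object):
    def __init__(self, app, reseller_prefix="KEY"):
        self.app = app
        self.prefix = reseller_prefix + "_"

    def __call__(self, environ, start_response):
        # Paths look like /v1/<account>/<container>/<object>.
        parts = environ.get("PATH_INFO", "").split("/")
        account = parts[2] if len(parts) > 2 else ""
        if not account.startswith(self.prefix):
            # Not our account: pass through without any auth lookup.
            return self.app(environ, start_response)
        # Our account: stand-in for a real token check (made up here).
        if environ.get("HTTP_X_AUTH_TOKEN") != "valid-token":
            start_response("401 Unauthorized", [("Content-Length", "0")])
            return [b""]
        return self.app(environ, start_response)
```

The point is only that the expensive part (the token check) sits behind the prefix test, so requests for other resellers never pay for it.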

 

Maybe you could have a different Swift proxy in each “region” that only points to the local keystone, so you are only authenticating locally.. ok. But then a user can’t come and access their data if they happen to be in a different region.. even though you’re talking to the same cluster.

So really what we want to do is take advantage of Keystone federation, where we only ever have to talk to 1 instance, the local one for the region a Swift proxy lives in. That way we get the speed and the ability to access our data from anywhere.

 

Next time…

So in the next post we’ll add real keystone federation, but assume each federated environment is its own cluster, including its own Swift cluster. In which case we could take advantage of another Swift feature, container sync.

Then the final post would be what we really want, 1 large Swift cluster with multiple Federated keystone OpenStack clusters. But that will involve fiddling with the federation sync metadata and need a more detailed explanation on how Swift authentication works. So first I want to cover what Swift can do simply with the tools it comes with!

Keystone Federated Swift – A series of posts

Matt Treinish and I proposed a presentation at the OpenStack Summit in Vancouver in May. It was accepted but on standby, which simply means we have a lightning talk slot (10 minutes), but may be bumped up to a full slot based on how other presenters go (visa issues, pull outs, etc).

Anyway, 10 minutes won’t do the topic justice, so I thought what better than to also post details as I work through them here. Some of what I say may end up in the presentation, or may not. All I know is I’ve been asked a few times how to set up Swift in a Keystone federated environment. Let’s face it, Swift scales to a global cluster no worries, however other OpenStack components may have trouble doing the same. So federating a bunch of different regions and treating them as their own clouds makes heaps of sense. Great, then what’s the best way of integrating Swift into this federated environment?

 

My current idea is to walk through 3 initial topologies. The first I’ll call ‘false federation’, where we simply use Swift’s ability to run multiple authentication middlewares as different resellers to authenticate to multiple keystone endpoints. For those playing along at home, the keystone middleware currently doesn’t let you do this, but I have a trivial patch that fixes this.. and plan to push it upstream as soon as I have a chance to clean it up and add tests.

 

The second is separate swift clusters in each cloud, but using Swift’s container sync to move objects so you still have access to your data on any cloud you visit… eventually.

 

And finally the third is what we’d all want: 1 large swift cluster that all clouds talk to, so no matter where you are, there your data is. Plus it gives better durability, dispersion, and everything we want out of a Swift cluster. The trick here will be making sure the same swift account name is used no matter which keystone you talk to, and I assume this will come down to how you configure what you share during federated token exchange. I’ll leave this as the last post as we still need to play with it to iron it out.. but obviously it is the dream.

These diagrams are obviously overly simplistic, but I hope you get the idea.

The next post will be the ‘False federation’ approach seeing as I already have a swift keystoneauth middleware patch that solves this.

Setting up a basic keystone for Swift + Keystone dev work

As a Swift developer, most of the development works in a Swift All In One (SAIO) environment. This environment simulates a multinode swift cluster on one box. All the SAIO documentation points to using tempauth for authentication. Why?

Because most of the time authentication isn’t the thing we are working on. Swift has many moving parts, and so tempauth, which only exists for testing swift and is configured in the proxy-server.conf file, works great.

However, there are times you need to debug or test keystone + swift integration. In this case, we tend to build up a devstack for the keystone component. But if all we need is keystone, then can we just throw one up on a SAIO?… yes. So this is how I do it.

Firstly, I’m going to be assuming you have SAIO already set up. If not, go do that first. Not that it really matters, as we only configure the SAIO keystone component at the end. But I will be making keystone listen on localhost, so if you are doing this on another machine, you’ll have to change that.

Further, this will set up a keystone server in the form you’d expect from a real deploy (setting up the admin and public interfaces).

 

Step 1 – Get the source, install and start keystone

Clone the source code:
cd $HOME
git clone https://github.com/openstack/keystone.git

Setup a virtualenv (optional):
mkdir -p ~/venv/keystone
virtualenv ~/venv/keystone
source ~/venv/keystone/bin/activate

Install keystone:
cd $HOME/keystone
pip install -r requirements.txt
pip install -e .
cp etc/keystone.conf.sample etc/keystone.conf

Note: We are running the services from the source so config exists in source etc.

 

The fernet keys seem to assume a full /etc path, so we’ll create it. Maybe I should update this to put all config there but for now meh:
sudo mkdir -p /etc/keystone/fernet-keys/
sudo chown $USER -R /etc/keystone/

Setup the database and fernet:
keystone-manage db_sync
keystone-manage fernet_setup

Finally we can start keystone. Keystone is a wsgi application and so needs a server to pass it requests. The current keystone developer documentation seems to recommend uwsgi, so let’s do that.

 

First we need uwsgi and the python plugin. On a Debian/Ubuntu system you:
sudo apt-get install uwsgi uwsgi-plugin-python

Then we can start keystone, by starting the admin and public wsgi servers:
uwsgi --http 127.0.0.1:35357 --wsgi-file $(which keystone-wsgi-admin) &
uwsgi --http 127.0.0.1:5000 --wsgi-file $(which keystone-wsgi-public) &

Note: Here I am just backgrounding them. You could run them in tmux or screen, or set up uwsgi to run them all the time. But that’s out of scope for this.

 

Now a netstat should show that keystone is listening on port 35357 and 5000:
$ netstat -ntlp | egrep '35357|5000'
tcp 0 0 127.0.0.1:5000 0.0.0.0:* LISTEN 26916/uwsgi
tcp 0 0 127.0.0.1:35357 0.0.0.0:* LISTEN 26841/uwsgi
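If you’d rather check from Python than eyeball netstat, a minimal sketch like the one below does the same job using only the standard library. The helper name `is_listening` is mine; the host and ports are the ones used above.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


# Keystone's admin and public ports as started above.
for port in (35357, 5000):
    state = "listening" if is_listening("127.0.0.1", port) else "not listening"
    print(port, state)
```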

Step 2 – Setting up keystone for swift

Now that we have keystone started, it’s time to configure it. Firstly you need the openstack client to configure it, so:
pip install python-openstackclient

Next we’ll use all keystone defaults, so we only need to pick an admin password. For the sake of this how-to I’ll pick the developer documentation example of `s3cr3t`. Be sure to change this. So we can do a basic keystone bootstrap with:
keystone-manage bootstrap --bootstrap-password s3cr3t

Now we just need to set up some openstack env variables so we can use the openstack client to finish the setup. To make it easy to access I’ll dump them into a file you can source. But feel free to dump these in your bashrc or whatever:
cat > ~/keystone.env <<EOF
export OS_USERNAME=admin
export OS_PASSWORD=s3cr3t
export OS_PROJECT_NAME=admin
export OS_USER_DOMAIN_ID=default
export OS_PROJECT_DOMAIN_ID=default
export OS_IDENTITY_API_VERSION=3
export OS_AUTH_URL=http://localhost:5000/v3
EOF


source ~/keystone.env

 

Great, now we can finish configuring keystone. Let’s first set up a service project (tenant) for our Swift cluster:
openstack project create service

Create a user for the cluster to auth as when checking user tokens and add the user to the service project, again we need to pick a password for this user so `Sekr3tPass` will do.. don’t forget to change it:
openstack user create swift --password Sekr3tPass --project service
openstack role add admin --project service --user swift

Now we will create the object-store (swift) service and add the endpoints for the service catalog:
openstack service create object-store --name swift --description "Swift Service"
openstack endpoint create swift public "http://localhost:8080/v1/AUTH_\$(tenant_id)s"
openstack endpoint create swift internal "http://localhost:8080/v1/AUTH_\$(tenant_id)s"

Note: We need to define the reseller_prefix we want to use in Swift. If you change it in Swift, make sure you update it here.
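To see what that endpoint template turns into for a given project, here is a small Python sketch of the substitution. It only illustrates the idea (swap `$(` for `%(` and apply Python string formatting) and is not Keystone’s actual code; the function name `expand_endpoint` and the example project ID are made up.

```python
def expand_endpoint(template, tenant_id):
    """Expand a catalog URL template like '.../AUTH_$(tenant_id)s'."""
    # Turn the $(name)s placeholder into Python's %(name)s form, then fill it in.
    return template.replace("$(", "%(") % {"tenant_id": tenant_id}


template = "http://localhost:8080/v1/AUTH_$(tenant_id)s"
# "abc123" is a made-up project ID, real ones are long hex strings.
print(expand_endpoint(template, "abc123"))
```

The part before the placeholder (AUTH_ here) is where the reseller_prefix shows up in the URL, which is why it has to agree with what Swift is configured with.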

 

Now we can add roles that will match to roles in Swift, namely an operator (someone who will get a Swift account) and reseller_admins:
openstack role create SwiftOperator
openstack role create ResellerAdmin

Step 3 – Setup some keystone users to auth as.

TODO: create all the tempauth users here

 

Here, it would make sense to create the tempauth users devs are used to using, but I’ll just go create a user so you know how to do it. First create a project (tenant) for this example demo:
openstack project create --domain default --description "Demo Project" demo

Create a user:
openstack user create --domain default --password-prompt matt

We’ll also go create a basic user role:
openstack role create user

Now connect the 3 pieces together by adding user matt to the demo project with the user role:
openstack role add --project demo --user matt user

If you wanted user matt to be a swift operator (have an account) you’d:
openstack role add --project demo --user matt SwiftOperator

or even a reseller_admin:
openstack role add --project demo --user matt ResellerAdmin

If you’re in a virtual env, you can leave it now, because next we’re going to go back to your already set up swift to do the Swift -> Keystone part:
deactivate

Step 4 – Configure Swift

To get swift to talk to keystone we need to add 2 middlewares to the proxy pipeline. And in the case of a SAIO, remove the tempauth middleware. But before we do that we need to install the keystonemiddleware package to get one of the 2 middlewares, keystone’s authtoken:
sudo pip install keystonemiddleware

Now you want to replace your tempauth middleware in the proxy pipeline with authtoken keystoneauth so it looks something like:
pipeline = catch_errors gatekeeper healthcheck proxy-logging cache bulk tempurl ratelimit crossdomain container_sync authtoken keystoneauth staticweb copy container-quotas account-quotas slo dlo versioned_writes proxy-logging proxy-server

Then in the same ‘proxy-server.conf’ file you need to add the paste filter sections for both of these new middlewares:
[filter:authtoken]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
auth_host = localhost
auth_port = 35357
auth_protocol = http
auth_uri = http://localhost:5000/
admin_tenant_name = service
admin_user = swift
admin_password = Sekr3tPass
delay_auth_decision = True
# cache = swift.cache
# include_service_catalog = False

[filter:keystoneauth]
use = egg:swift#keystoneauth
# reseller_prefix = AUTH
operator_roles = admin, SwiftOperator
reseller_admin_role = ResellerAdmin

Note: If you change the reseller_prefix here, make sure you change it in keystone too. And notice this is where you map operator_roles and reseller_admin_role in swift to the roles in keystone. Here anyone with the keystone role admin or SwiftOperator is a swift operator, and those with the ResellerAdmin role are reseller_admins.
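The role mapping can be pictured with a few lines of Python. This is a simplified sketch of the idea, not keystoneauth’s real logic (which also deals with ACLs, service roles and more); the function names are mine, and comparing roles case-insensitively is an assumption of the sketch.

```python
def parse_roles(value):
    """Split a config value like 'admin, SwiftOperator' into a list."""
    return [role.strip() for role in value.split(",") if role.strip()]


def classify_user(user_roles, operator_roles, reseller_admin_role):
    """Return 'reseller_admin', 'operator' or 'none' for a set of roles."""
    # Case-insensitive comparison is an assumption made for this sketch.
    roles = {role.lower() for role in user_roles}
    if reseller_admin_role.lower() in roles:
        return "reseller_admin"
    if roles & {role.lower() for role in operator_roles}:
        return "operator"
    return "none"


operators = parse_roles("admin, SwiftOperator")
print(classify_user(["user", "SwiftOperator"], operators, "ResellerAdmin"))
print(classify_user(["user"], operators, "ResellerAdmin"))
```

So user matt with only the user role gets nothing special, while adding SwiftOperator makes him an operator on his project’s account.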

 

And that’s it. Now you should be able to restart your swift proxy and it’ll go off and talk to keystone.

 

You can use your Python swiftclient now to go talk to it, and what’s better, swiftclient understands the OS_* variables, so you can just source your keystone.env and talk to your cluster (to be admin) or export some new envs for the user you’ve created. If you want to use curl you can, but it’s _much_ easier to use swiftclient.

 

Tip: You can use: swift auth to get the auth_token if you want to then use curl.

 

If you want to authenticate via curl then for v3, use: https://docs.openstack.org/developer/keystone/devref/api_curl_examples.html
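For reference, the v3 password request body can be built from the same OS_* variables with a few lines of Python. This sketch only builds and prints the JSON; it doesn’t talk to keystone. The function name `build_v3_auth_body` and the fallback defaults are my own choices, taken from the keystone.env above.

```python
import json
import os


def build_v3_auth_body(env=None):
    """Build the JSON body for POST <OS_AUTH_URL>/auth/tokens (password auth)."""
    env = os.environ if env is None else env
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": env.get("OS_USERNAME", "admin"),
                        "domain": {"id": env.get("OS_USER_DOMAIN_ID", "default")},
                        "password": env.get("OS_PASSWORD", ""),
                    }
                },
            },
            # Scope the token to a project (Swift accounts map to projects).
            "scope": {
                "project": {
                    "name": env.get("OS_PROJECT_NAME", "admin"),
                    "domain": {"id": env.get("OS_PROJECT_DOMAIN_ID", "default")},
                }
            },
        }
    }


print(json.dumps(build_v3_auth_body(), indent=2))
```

You would POST that body with a Content-Type of application/json to the /auth/tokens path under your OS_AUTH_URL; with v3 the token comes back in the X-Subject-Token response header rather than in the body.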

 

Or for v2, I use:
url="http://localhost:5000/v2.0/tokens"
auth='{"auth": {"tenantName": "demo", "passwordCredentials": {"username": "matt", "password": ""}}}'

 

curl -s -d "$auth" -H 'Content-type: application/json' $url |python -m json.tool

 

or

curl -s -d "$auth" -H 'Content-type: application/json' $url |python -c "import sys, json; print(json.load(sys.stdin)['access']['token']['id'])"

To just print out the token. Although a simple swift auth would do all this for you.

pudb debugging tips

As an OpenStack Swift dev I obviously write a lot of Python. Further, Swift is a cluster and so it has a bunch of moving pieces, so debugging is very important. Most of the time I use pudb and then jump into PyCharm’s debugger if I get really stuck.

Pudb is a curses-based version of pdb. I find it pretty awesome, and you can use it while ssh’d somewhere. So I thought I’d write up some tips that I use. Mainly so I don’t forget 🙂

The first and easiest way to run pudb is use pudb as the python runner.. i.e:

pudb <python script>

On first run, it’ll start with the preferences window up. If you want to change preferences you can just hit ‘<ctrl>+p’. However you don’t need to remember that, as hitting ‘?’ will give you a nice help screen.

I prefer to see line numbers, I like the dark vim theme and best part of all, I prefer my interactive python shell to be ipython.

While you’re debugging, like in pdb, there are some simple commands:

  • n – step over (“next”)
  • s – step into
  • c – continue
  • r/f – finish current function
  • t – run to cursor
  • o – show console/output screen
  • b – toggle breakpoint
  • m – open module
  • ! – Jump into interactive shell (most useful)
  • / – text search

There are obviously more than that, but they are what I mostly use. The open module command is great if you need to set a breakpoint somewhere deeper in the code base, so you can open it, set a breakpoint and then happily press ‘c’ to continue until it hits. The ‘!’ is the most useful, it’ll jump you into an interactive python shell at the exact point the debugger is at. So you can jump around, check/change settings and poke in areas to see what’s happening.

As with pdb you can also use code to insert a breakpoint, so pudb will be triggered rather than having to start a script with pudb. I give an example of how in the nosetests section below.

nosetests + pudb

Sometimes the best way to use pudb is to debug unit tests, or even write a unit (or functional or probe) test to get you into an area you want to test. You can use pudb to debug these too. And there are 2 ways to do it.

The first way is by installing the ‘nose-pudb’ pip package:

pip install nose-pudb

Now when you run nosetests you can add the --pudb option and it’ll break into pudb if there is an error, so you can go poke around in ‘post-mortem’ mode. This is really useful, but doesn’t allow you to actually trace the tests as they run.

So the other way of using pudb in nosetests is to actually insert some code in the test that will trigger as a breakpoint and start up pudb. You do it exactly how you would with pdb, except you substitute pudb for pdb. So just add the following line of code to your test where you want to drop into pudb:

import pudb; pudb.set_trace()

And that’s it.. well mostly, because pudb is command line you need to tell nosetests to not capture stdout with the ‘-s’ flag:

nosetests -s test/unit/common/middleware/test_cname_lookup.py
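Put together, a test with an embedded breakpoint might look something like the sketch below. The test itself is a made-up example, and the DEBUG_WITH_PUDB environment variable guard is my own addition for illustration (so the breakpoint only fires when you ask for it); it is not something pudb or nose provides.

```python
import os
import unittest


class TestExample(unittest.TestCase):
    def test_addition(self):
        total = sum([1, 2, 3])
        # Hypothetical guard: only drop into pudb when DEBUG_WITH_PUDB is set,
        # so normal test runs are not interrupted.
        if os.environ.get("DEBUG_WITH_PUDB"):
            import pudb
            pudb.set_trace()
        self.assertEqual(total, 6)
```

Run it with the variable set and the ‘-s’ flag (so stdout isn’t captured) and you land in pudb at the breakpoint; run it without and the test just passes.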

testr + pudb

No problem here, it uses the same approach as above, where you programmatically set a trace as you would for pdb. Just follow the ‘Debugging (pdb) Tests’ section on this page (except substitute pudb for pdb).

 

Update – run_until_failure.sh

I’ve been trying to find some intermittent unit test failures recently. So I whipped up a quick bash script that I run in a tmux session, which really helps find and deal with them. I thought I’d add it to this post, as I can then add nose-pudb to make it pretty useful.

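#!/bin/bash
# Re-run the given command until it exits with a non-zero status.

n=0
while true
do
  clear
  # Quote "$@" so arguments containing spaces survive intact.
  "$@"
  if [ $? -gt 0 ]
  then
    echo 'ERROR'
    echo "number $n"
    break
  fi
  n=$((n+1))
  sleep 1
done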

With this I can simply:
run_until_failure.sh tox -epy27

 

It’ll stop looping once the command passed returns something other than 0.
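If you’d rather have the same loop in Python (say, to extend it later), a rough equivalent using only the standard library could look like this. The `max_runs` parameter is an extra I’ve added so the loop can be bounded; the bash script above has no such limit.

```python
import subprocess
import time


def run_until_failure(cmd, pause=1.0, max_runs=None):
    """Run cmd repeatedly until it fails.

    Returns (successful_runs, exit_code) on the first non-zero exit, or
    (successful_runs, 0) if max_runs is reached without a failure.
    """
    passes = 0
    while max_runs is None or passes < max_runs:
        code = subprocess.call(cmd)
        if code != 0:
            print("ERROR")
            print("number", passes)
            return passes, code
        passes += 1
        time.sleep(pause)
    return passes, 0


# Example (not run here): run_until_failure(["tox", "-epy27"])
```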

Once I have an error, I focus in on the area where it happens (to speed up the search a bit). I can also use nose-pudb to drop me into post-mortem mode so I can poke around in ipython. For example, I’m currently running:

 

run_until_failure.sh nosetests --pudb test/unit/proxy/test_server.py

 

Then I can come back to the tmux session, if I’m dropped in a pudb interface, I can go poke around.

Swift Container sharding – locked db POC – Benchmarking observations

The latest POC is at the benchmarking stage, and for the most part it’s going well. I have set up 2 clusters in the cloud, not huge, but 2 proxies and 4 storage nodes each. A benchmarking run involves pointing an ssbench master at each cluster and putting each cluster under load. In both cases we only use 1 container, and on one cluster this container will have sharding turned on.

So far it’s looking pretty good. I’ve done many runs, and usually find a bug at scale.. but recently I’ve done two runs of the latest revision, alternating which cluster is the sharded one (the cluster benchmarked with sharding turned on for its container). Below is the grafana statsd output of the second run. Note that cluster 2 is the sharded cluster in this run:

[Image: 2016-12-22-0928_cluster2_run2_smaller]

Looking at the picture there are a few observations we can make. The peaks in the ‘Container PUT Latency – Cluster 2’ graph correspond to when a container is sharded (in this case, the one container and then its shards sharding).

As I mentioned earlier ssbench is running the benchmark and the benchmark is very write (PUT) heavy. Here is the sharding scenario file:

{
  "name": "Sharding scenario",
  "sizes": [{
    "name": "zero",
    "size_min": 0,
    "size_max": 0
  }],
  "initial_files": {
    "zero": 100
  },
  "run_seconds": 86400,
  "crud_profile": [200, 50, 0, 5],
  "user_count": 2,
  "container_base": "shardme",
  "container_count": 1,
  "container_concurrency": 1,
  "container_put_headers": {
    "X-Container-Sharding": "on"
  }
}
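As an aside, the crud_profile in the scenario above is a list of relative weights for create, read, update and delete operations. A quick Python sketch turns those weights into shares; the helper name `crud_shares` is mine. For [200, 50, 0, 5] it works out to roughly 78% creates (PUTs), 20% reads (GETs), no updates and 2% deletes.

```python
def crud_shares(profile):
    """Convert ssbench-style CRUD weights into fractions of all operations."""
    names = ("create", "read", "update", "delete")
    total = float(sum(profile))
    return {name: weight / total for name, weight in zip(names, profile)}


shares = crud_shares([200, 50, 0, 5])
for name in ("create", "read", "update", "delete"):
    print("%-6s %5.1f%%" % (name, shares[name] * 100))
```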

The only difference between this scenario and the non-sharding one is that the latter doesn’t set the X-Container-Sharding meta on the initial container PUT. The crud profile shows that we are heavy on PUTs and GETs. Because jobs are randomised, I don’t expect exactly the same numbers when it comes to object count on the servers. However, there is a rather large discrepancy between the object counts on the two clusters:

Cluster 1:

HTTP/1.1 204 No Content
Content-Length: 0
X-Container-Object-Count: 11291190
Accept-Ranges: bytes
X-Storage-Policy: gold
X-Container-Bytes-Used: 0
X-Timestamp: 1482290574.52856
Content-Type: text/plain; charset=utf-8
X-Trans-Id: tx9dd499df28304b2d920aa-00585b2d3e
Date: Thu, 22 Dec 2016 01:32:46 GMT

Cluster 2:

Content-Length: 0
X-Container-Object-Count: 6909895
X-Container-Sharding: True
X-Storage-Policy: gold
X-Container-Bytes-Used: 0
X-Timestamp: 1482290575.94012
Content-Type: text/plain; charset=utf-8
Accept-Ranges: bytes
X-Trans-Id: txba7b23743e0d45a68edb8-00585b2d61
Date: Thu, 22 Dec 2016 01:33:27 GMT

So cluster 1 has about 11 million objects and cluster 2 about 7 million. That’s quite a difference. Which gets me wondering what’s causing such a large difference in PUT throughput?
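To put numbers on that gap, here is the arithmetic in Python using the two X-Container-Object-Count values from the headers above. Keep in mind the object count is only a proxy for PUT throughput, since the runs also include a small share of deletes.

```python
cluster1_objects = 11291190  # unsharded container
cluster2_objects = 6909895   # sharded container

difference = cluster1_objects - cluster2_objects
ratio = cluster2_objects / float(cluster1_objects)

print("difference: %d objects" % difference)
print("sharded cluster reached %.1f%% of the unsharded count" % (ratio * 100))
```

So the sharded cluster ended up with roughly 4.4 million fewer objects, a bit over 60% of what the unsharded cluster managed in the same run.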

The only real difference in the proxy object PUT when comparing sharded to unsharded is the finding of the shard container the object server will need to update, in which case another request is made to the root container asking for the pivot (if there is one). Is this extra request really causing an issue? I do note the object-updater (last graph in the image) is also working harder, as the number of successes during the benchmarks is much higher, meaning there are more requests falling into async pendings.
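For anyone unfamiliar with the idea of pivots, the general technique of mapping an object name onto a shard by split points can be sketched in a few lines of Python. To be clear, this is not the POC’s code (there the proxy asks the root container); the function name `find_shard` and the pivot values are invented purely to illustrate the concept.

```python
import bisect


def find_shard(obj_name, pivots):
    """Return the index of the shard that would hold obj_name.

    `pivots` is a sorted list of split points. Names that sort before
    pivots[0] land in shard 0, names from pivots[0] up to (but not
    including) pivots[1] land in shard 1, and so on. With no pivots
    everything lands in shard 0, i.e. the container is unsharded.
    """
    return bisect.bisect_right(pivots, obj_name)


pivots = ["g", "p"]  # made-up split points
for name in ("cat.png", "horse.png", "zebra.png"):
    print(name, "-> shard", find_shard(name, pivots))
```

The lookup itself is cheap; the cost being questioned above is the extra round trip to the root container to learn the pivot in the first place.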

Maybe the extra updater work is because of the extra load on the container server (this additional request)?

To test this theory, I can push the sharder harder and force container updates into the root container. This would stop the extra request.. but force more traffic to the root container (which we are kinda doing anyway). We should still see benefits, as the root container would be much smaller (because it’s sharded) than its non-sharded counterpart. And this will allow us to see if this is causing the slower throughput.

Update: I’m currently running a new scenario which is all PUTs, so let’s see how that fares. Will keep you posted.

Swift + Xena would make the perfect digital preservation solution

Some of you might not know, but for some years I worked at the National Archives of Australia on what was, at the time, their leading digital preservation platform. It was awesome, opensource, and they paid me to hack on it.
The most important parts of the platform were Xena and Digital Preservation Recorder (DPR). Xena was, and hopefully still is, amazing. It takes in a file and guesses the format. If it’s a closed proprietary format and it has the right xena plugin, it converts it to an open standard and optionally turns it into a .xena file ready to be ingested into the digital repository for long term storage.

We did this knowing that proprietary formats change so quickly that if you store a file in one long term (20, 40, 100 years) you may not be able to open it. An open format, on the other hand, is open even if there is no software left that can read it, so you can get your data back.

Once a file had passed through Xena, we’d use DPR to ingest it into the archive. Once in the archive, we had other opensource daemons we wrote which ensured we didn’t lose things to bitrot, we’d keep things duplicated and separated. It was a lot of work, and the size of the space required kept growing.

Anyway, now I’m an OpenStack Swift core developer, and wow, I wish Swift was around back then, because it’s exactly what is required for the DPR side. It duplicates, infinitely scales, checks checksums, quarantines and corrects. It keeps everything replicated and separated and does it all automatically. Swift is also highly customisable. You can create your own middleware and insert it in the proxy pipeline or in any of the storage node’s pipelines, and do whatever you need it to do. Add metadata, do something to the object on ingest or whenever the object is read, update some other system.. really you can do whatever you want. Maybe even wrap Xena into some middleware.

Going one step further, IBM have been working on a thing called storlets which uses swift and docker to do some work on objects and is now in the OpenStack namespace. Currently storlets are written in Java, and so is Xena.. so this might also be a perfect fit.

Anyway, I got talking with Chris Smart, a mate who also used to work in the same team at NAA, and it got my mind thinking about all this, so I thought I’d place my rambling thoughts somewhere in case other archives or libraries are interested in digital preservation and need some ideas.. best part, the software is open source and also free!

Happy preserving.