Friday, 29 March 2013

Does Windows Azure comply with UK/EU Data Privacy Law?

TL;DR

Yes, it does.  The Information Commissioner’s Office in the UK considers that Windows Azure provides adequate protection for the rights of individuals with regard to their data that is held on Windows Azure. 

Given that you have the other relevant measures in place to comply with the Data Protection Act, this means that you can go ahead and use Windows Azure to store your customers’ personal information.  As Microsoft comply with the EU/US Safe Harbor Framework, the data can be treated as though it remained solely within the EU geographic region.

Background

As developers we often need to store information about individuals, such as their name and email address.  When people share this information with us, they expect that we will treat the information properly and respect their privacy. 

In the UK, the Data Protection Act provides a legal framework that defines how companies capture and process information about individuals.  For instance, a person may choose to let you send them an email newsletter, but tell you that they do not want to receive them from other companies that you work with.

In the UK, the Information Commissioner’s Office (ICO) is the body responsible for upholding information rights and privacy for individuals.  The ICO can and does fine companies that do not treat individuals’ information properly – and these fines can be quite large too!

To the Cloud

As developers we increasingly want to be able to use hosted cloud services to build our web sites and internet-enabled applications.  There are a number of choices out there, including large offerings such as Windows Azure and Amazon Web Services, as well as more targeted offerings such as MongoHQ.

If we store any information about an individual (in the UK the law relates to living individuals), then any cloud service that we use must ensure that the data about those individuals is protected.

Keep it local

Other member states of the European Union have similar data protection laws and there is a common legal framework to protect the rights of individuals in the EU.

To be able to benefit from the protection that this legal framework gives, the individual’s data has to remain physically within the EU.  If the data were held in another country outside of the EU, then the laws of that country would apply to it.  For example, the US has a very different approach to the protection of individuals’ data than we do in the EU.

Back to the Cloud

Let’s look at how Amazon AWS and Microsoft Azure - two popular US cloud providers - handle this.

Amazon make the statement that any data hosted in one of their EU regions will remain within that region.  You can read that here.  Okay, not much detail in that, but it sounds fine.

Azure talk a little more about this than Amazon and you can read about that here.  If you are eagle-eyed, you will notice that data in Azure EU regions will generally remain in the EU, but may be shipped to the US for disaster recovery purposes.

Oh dear, that sounds like it breaks the condition that data has to remain physically within the EU.

Can I use Azure then?

Yes, you can.  The reason for this – also stated on the Azure Trust Centre Privacy page – is that Microsoft are certified as complying with the EU/US Safe Harbor Framework.

This is a legal framework between the EU and the US Department of Commerce that essentially provides EU-compliant data protection for data that is moved to the US under the framework.  The ICO deem that the Safe Harbor Framework provides adequate protection for the rights of the individuals whose data may be transferred to the US by Microsoft.  You can read about that here.

That’s simple then – why the doubt?

So, if it’s that easy, why am I writing this article in the first place?  Well, I’ve been looking at Azure for a while now, wanting to use it for some of my applications.  The advice I had received in the past was basically that once the data was in the US, other US laws, such as the Patriot Act, could override the Safe Harbor Framework and remove the legal protections it provides.

If that were the case, then I would need to treat the data as if it were being shipped outside of the EU and under the jurisdiction of different data protection laws.  Not something that my customers would want to hear!  It was also the reason why I’d been using Amazon AWS.

I recently spun up a build server on an Azure VM and I absolutely loved the experience of using Azure.  I was thinking what a shame it was that I couldn’t use it more and so I got in touch with the venerable Mr Hanselman, Microsoft developer evangelist extraordinaire, to say “please can we use Azure in Europe?” (I had previously tried other routes, but without getting any answers).

Scott kindly took my question back to the Azure team and then came back with a bunch of challenges for me, the summary being that all I needed was already on the Azure Trust Centre Privacy page.  And he was quite right too!

I got on the phone to the ICO and asked them about it.  They confirmed that this provided “adequate protection for the rights of individuals”, that there are exceptions in data protection law to allow, for example, the Courts or the Police to have access as part of their investigations – both in the EU and the US – and that I could now just focus on complying with the Data Protection Act requirements.

Awesome – my future is now on Azure!

A note of caution

It’s worth remembering that the location of the data is just one part of the process of protecting your customers’ data.  You need to make sure that you comply with all other aspects of the relevant data protection laws. 

Generally, the more sensitive the information, and the more people about whom you hold that sort of information, the less likely it is that you will be able to host it externally.

In the UK, if you’re not sure about how to go about things, contact the ICO for advice – they’re very helpful people!

Wednesday, 23 May 2012

Embedded RavenDB Indexes with Obfuscation

I ran into an issue today with using RavenDB from an obfuscated assembly.  RavenDB started giving some nasty looking errors like:

System.InvalidOperationException: Could not understand query:
-- line 2 col 43: invalid NewExpression

After bashing my head against this for too long, I had run out of ideas and so I posted the problem on the RavenDB group.  One of the really good things about RavenDB is the amazing support from both the Hibernating Rhinos team and the community on this group.  So, pretty soon it was problem solved!

The problem: the obfuscation was renaming the anonymous types that are used when defining indexes from code in RavenDB.
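
To make that concrete, here is a minimal sketch of the kind of code-defined index that breaks.  The Containers_ByText index and the Container class are hypothetical names for illustration – any Map expression that projects into an anonymous type is affected:

   // Requires the RavenDB client library; the anonymous type in the
   // Map projection below is what the obfuscator was renaming.
   using System.Linq;
   using Raven.Client.Indexes;

   public class Container
   {
      public string Id { get; set; }
      public string Text { get; set; }
   }

   public class Containers_ByText : AbstractIndexCreationTask<Container>
   {
      public Containers_ByText()
      {
         // The compiler generates an anonymous type (<>f__AnonymousType...)
         // for this projection; rename it and RavenDB can no longer parse
         // the index definition, giving the "invalid NewExpression" error.
         Map = containers => from container in containers
                             select new { container.Text };
      }
   }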

The solution: either put your types into a non-obfuscated assembly or tell the obfuscator to stop renaming anonymous types.  Let’s look at the second option a bit more.

Okay, so how do you detect an anonymous type?  One way is to look at their names.  The compiler gives anonymous types names and puts the text “AnonymousType” into the name. For example:

new { Name = "Sean" }.GetType().Name  // Gives: <>f__AnonymousType0`1

Simple enough, but there is a caveat: the naming of anonymous types is an implementation detail and may vary between compiler implementations.  You cannot rely on this working with different compilers. 
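
If you need a slightly more defensive check in your own tooling, one option is to combine the name check with the [CompilerGenerated] attribute that the C# compiler also applies to anonymous types.  This is a heuristic sketch, not a guarantee, and IsProbablyAnonymous is a hypothetical helper name:

   using System;
   using System.Runtime.CompilerServices;

   public static class TypeExtensions
   {
      // Heuristic: anonymous types are compiler generated, internal,
      // and carry "AnonymousType" in their name under the Microsoft
      // C# compiler.
      public static bool IsProbablyAnonymous(this Type type)
      {
         return type.IsDefined(typeof(CompilerGeneratedAttribute), false)
                && type.Name.Contains("AnonymousType")
                && type.IsNotPublic;
      }
   }

   // Usage: new { Name = "Sean" }.GetType().IsProbablyAnonymous() gives true.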


So, with that in mind, let’s look at a solution…


Different obfuscators allow you to add obfuscation rules in various ways.  In this repo there is an example of using Eazfuscator with a RavenDB index.  (Note that you will need to install Eazfuscator to be able to run the code.)  All that is needed is an assembly-scoped ObfuscationAttribute to prevent renaming of any type whose name contains the text “AnonymousType”:

[assembly: Obfuscation(Feature = "Apply to type *AnonymousType*: renaming", Exclude = true, ApplyToMembers = true)]



Bingo, everything works as it should again…happy days!

Saturday, 21 April 2012

RavenDB Magic – Modelling Polymorphic Collections

I'm starting to port my main application over to RavenDB.  A driving design principle for RavenDB is to reduce friction for the developer and the main reason I’m choosing RavenDB is to make my life easier.  Here's one example of that...

Let’s say that you’ve got a model with a marker interface and some types implementing that interface.  Something like this:
   public interface ILookup { }

   public class IntegerLookup : ILookup
   {
      public int Number { get; set; }
   }
   
   public class StringLookup : ILookup
   {
      public string Text { get; set; }
   }
Then you have a type that holds a collection of the interface:
   public class Container
   {
      public ILookup[] Lookups { get; set; }
   }
All good so far, and now you want to store this in a database.  Not quite so simple!  If you’re using a relational database, whether through an ORM or another means, you will have to do quite a bit more work to get your objects into the database and back.

However, if you’re using RavenDB then your work is done!  Well, nearly - all you need to do is to add an Id property to your container to tell RavenDB to store it as a root:
   public class Container
   {
      public string Id { get; set; }
      public ILookup[] Lookups { get; set; }
   }

That’s it – the rest is pure Raven magic!  Here’s a test showing how Raven handles it:
   [Test]
   public void can_store_polymorphic_collection()
   {
      using (var store = new EmbeddableDocumentStore
      {
         DataDirectory = "Data",
         RunInMemory = true
      })
      {
         store.Initialize();

         var holder = new Container
               {
                  Lookups = new ILookup[]
                              {
                                 new StringLookup {Text = "Hello"},
                                 new IntegerLookup {Number = 123}
                              }
               };
         using (var session = store.OpenSession())
         {
            session.Store(holder);
            session.SaveChanges();
         }

         using (var session = store.OpenSession())
         {
            var h = session.Load<Container>(holder.Id);

            Assert.That(h.Lookups
               .OfType<StringLookup>().Single().Text == "Hello");
            Assert.That(h.Lookups
               .OfType<IntegerLookup>().Single().Number == 123);
         }
      }
   }
You just new up an instance of Container, add in a couple of ILookup instances, then save it.  When you load it back up, it’s all there just as you need it to be.  Now, that is seriously impressive!
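
Under the hood, Raven leans on Json.NET type metadata to pull this off: each interface-typed element is stored with a $type property recording its concrete CLR type, which is how the correct classes come back on load.  The stored document looks roughly like this (a sketch – the exact metadata layout is an implementation detail, and the MyApp assembly name here is hypothetical):

   {
      "Lookups": [
         { "$type": "MyApp.StringLookup, MyApp", "Text": "Hello" },
         { "$type": "MyApp.IntegerLookup, MyApp", "Number": 123 }
      ]
   }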

Thursday, 28 July 2011

RavenDB & NServiceBus

The upcoming version 3.0 of everyone’s favourite service bus, NServiceBus, will use RavenDB for saga persistence.  This makes it even easier to work with sagas in NSB.
Very cool!  Even better than that, it looks like RavenDB will be available for NServiceBus Express users too.  You can read Udi’s comment here.
Big thanks to Ayende and Udi for getting these two elegant technologies together like this. 

Sunday, 27 February 2011

Mocking DbContext Entity Framework 4 Code First CTP5 with NSubstitute

Take an Entity Framework 4 Code First model, something like this:

   public class Customer
   {
      public int Id { get; set; }
      public string Name { get; set; }
   }

   public class CustomerContext : DbContext, ICustomerContext
   {
      public DbSet<Customer> Customers { get; set; }
   }


I now want to write some tests and I'm going to mock the DbContext using NSubstitute.  I first try something like this:

   [Test]
   public void can_mock_customer_context()
   {
      var context = Substitute.For<CustomerContext>();
      context.Customers.Returns(
         new DbSet<Customer>(
            new[]
            {
               new Customer {Name = "Sean"}
            })
         );
      Assert.AreEqual(1, context.Customers.Count());
   }

The problem is that the DbSet constructor is internal (as of EF4 Code First CTP5), so this won’t even compile.  Let’s abstract our DB access behind a simple interface instead, replacing DbSet<T> with the IQueryable<T> interface:

   public interface ICustomerContext
   {
      IQueryable<Customer> Customers { get; }
   }

This interface can be implemented like so:

   public class CustomerContext : DbContext, ICustomerContext
   {
      public DbSet<Customer> Customers { get; set; }
      IQueryable<Customer> ICustomerContext.Customers { get { return Customers; } }
   }

Now all we need to do is to use an implementation of IQueryable<T> in our mock.  I'm going to use EnumerableQuery<T> which gives me the following test that now passes:

   [Test]
   public void can_mock_customer_context()
   {
      var context = Substitute.For<ICustomerContext>();
      context.Customers.Returns(
         new EnumerableQuery<Customer>(
            new[]
            {
               new Customer {Name = "Sean"}
            })
         );
      Assert.AreEqual(1, context.Customers.Count());
   }

I'm new to NSubstitute but it seems to be the lowest-friction mocking library out there.  Just perfect for use with Entity Framework 4 Code First – certainly the lowest-friction ORM there is today!

Note that we could have used the repository pattern to wrap the DbContext instead of a simple interface; the approach is almost identical, as the sketch below shows.
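
For completeness, here is a minimal sketch of that repository variant – the ICustomerRepository and CustomerRepository names are hypothetical:

   public interface ICustomerRepository
   {
      IQueryable<Customer> Customers { get; }
   }

   public class CustomerRepository : ICustomerRepository
   {
      private readonly CustomerContext _context;

      public CustomerRepository(CustomerContext context)
      {
         _context = context;
      }

      // Expose the DbSet as IQueryable so that tests can substitute
      // an EnumerableQuery<Customer>, exactly as in the test above.
      public IQueryable<Customer> Customers
      {
         get { return _context.Customers; }
      }
   }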

Update: Ro Miller has an alternative approach using fakes that does a better job of surfacing IDbSet.  Check it out here: http://romiller.com/2010/09/07/ef-ctp4-tips-tricks-testing-with-fake-dbcontext/.

Thursday, 17 February 2011

Package Manager With Source Download

I was knocking up some code and I used my package manager of choice to add NLog to the project.  All good, very easy and away I coded.

The project required a single DLL as an output.  So, I thought hey – I’ll use Paul Stovell’s Tape to save using ILMerge in the build.  Good plan, but after 5 minutes browsing the NLog source on GitHub I realised I was not sure how to find the version I needed.  That’s fine – ILMerge is pretty easy to use, so I let the build save the day.

On the same day (today) I read this post about ReSharper 6 features and suddenly realised how powerful it would be for package managers to be able to give me the source, not just the binaries.  Now that’s a killer feature, and it would be even better if the source came with a single-file option so I didn’t even have to run Tape.

Oh, please, soon…

Thursday, 3 February 2011

A Better Deal

Red Gate are now going to charge for Lutz Roeder’s Reflector.  This tool has been free for its whole life – more years than I care to remember.  It’s as much a part of a software developer’s toolkit as a hammer is part of a builder’s toolkit.

I have dealt with Red Gate as part of a very small company and as part of a very large company.  I am sorry to say that each time I have come away feeling less than happy.  Any company that sends me a maintenance bill at the end of year one that exceeds my original purchase price isn’t going to make me happy to deal with them!

Reading Red Gate’s pricing strategy, written by their co-founder Neil Davidson, I struggle to believe their other CEO, Simon Galbraith, when he says how sorry he is about charging for Reflector.  I just kept thinking that they want my cash – like the other times I’ve dealt with them.

Maybe Galbraith and Davidson should go and listen to Seth Godin’s pricing advice.  Maybe then I’d want to buy stuff from them! 

USD 35 is good value for Reflector and I may have to buy a copy.  I’d really just rather buy from another company and get a better deal.