Dustin Horne

Developing for fun...

Deeper Dive Into EF Core 2 - Part 3

Welcome to Part 3 of our deeper dive into Entity Framework Core 2!  In Part 1 of this series we looked at creating data models and setting them up for different join types.  We also looked at setting up our DbContext and getting our first migrations going and scripted.  In Part 2, we explored updating our data models, creating additional migrations and applying them to our database.  We also looked at dynamically creating our database at application start and applying our migrations, as well as seeding with some additional metadata.  In this post we're going to put the pieces together and consume our new data layer from a simple API.  We're going to make everything async front to back.  Let's get started.

The first two parts of this series discuss various database connection options and configuration which I won't cover again in this post, so if you find something not working, please refer to them.  I would highly recommend you start from the beginning, but if you'd like to start here, you can download the completed solution from Part 2 here:

TeamSamplePart2.zip (16.13 kb)

First Things First - A Little Cleanup

In the second part of this series, we initialized our database in our Program.cs file.  To start things off, I want to show you another option so you can keep your Program.cs file a bit cleaner.  Instead, we're going to perform our database initialization back in our Startup.cs class.  Go ahead and return to the Program.cs file in your TeamSample.Api project and revert the Main method to the following:

public static void Main(string[] args)
{
    BuildWebHost(args).Run();
}

With our Program.cs file tidied up, let's head back to our Startup.cs file.  Of particular interest this time is the Configure method.  In our last post, we used the ConfigureServices method to set up our DbContext for dependency injection.  We also created a scope in the main entry point of Program.cs.  We're going to do something similar here, except we're now going to create our scope from our "app" parameter.  The app parameter is an IApplicationBuilder which gives us access to the services registered with our application and dependency injection via the ApplicationServices property.  Before app.UseMvc(), go ahead and add the following code:

using (var scope = app.ApplicationServices.CreateScope())
{
    var services = scope.ServiceProvider;
    var context = services.GetRequiredService<TeamSampleDbContext>();
    TeamSampleDbInitializer.Initialize(context);
}

The code we've added here is almost identical to what we removed from Program.cs.  The only difference is that we've replaced "host.Services" with "app.ApplicationServices".  This keeps things a bit cleaner and even makes it a bit easier to move some of our configuration out to other config classes later if we find our Configure method to be getting a bit too busy.
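
As one possible way to move this out of Configure later, the scope-and-initialize block could live in an extension method.  This is only a sketch: the class name DatabaseInitializationExtensions and the method name InitializeTeamSampleDatabase are made up for illustration, and it assumes TeamSampleDbInitializer is reachable from the TeamSample.Data namespace.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using TeamSample.Data; // assumption: the initializer is visible from this namespace

namespace TeamSample.Api
{
    // Hypothetical helper class; it is not part of the original sample.
    public static class DatabaseInitializationExtensions
    {
        public static IApplicationBuilder InitializeTeamSampleDatabase(this IApplicationBuilder app)
        {
            // Same logic as the block added to Configure, just relocated.
            using (var scope = app.ApplicationServices.CreateScope())
            {
                var context = scope.ServiceProvider.GetRequiredService<TeamSampleDbContext>();
                TeamSampleDbInitializer.Initialize(context);
            }

            // Return the builder so the call can be chained with other middleware setup.
            return app;
        }
    }
}
```

With something like that in place, Configure would only need a call to app.InitializeTeamSampleDatabase() before app.UseMvc().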

Using Our Context Through DI

Now let's take a look at creating a simple API and leveraging our DbContext injected via dependency injection.  To keep things simple, we're going to inject our DbContext straight into our API controller.  However, in practice you're likely to have more complex business logic surrounding your data access and you're going to want it to be highly testable at a granular level.  To enable this and keep your controllers lean and mean, you'll likely want to create a separate logic layer and inject your logic instead, but these concepts are a bit beyond the scope of what we're covering, so we'll stick with just the API controllers for now.
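
To give a rough idea of what such a logic layer might look like, here is a minimal sketch.  The names IPositionLogic, PositionLogic and the TeamSample.Api.Logic namespace are hypothetical and not part of the sample; the sketch also assumes the Positions set and its Name property that we use later in this post.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using TeamSample.Data;

namespace TeamSample.Api.Logic // hypothetical namespace
{
    // Hypothetical abstraction a controller could depend on instead of the DbContext.
    public interface IPositionLogic
    {
        Task<List<string>> GetPositionNamesAsync();
    }

    // Hypothetical implementation; the DbContext is injected here rather than in the controller.
    public class PositionLogic : IPositionLogic
    {
        private readonly TeamSampleDbContext _context;

        public PositionLogic(TeamSampleDbContext context)
        {
            _context = context;
        }

        public Task<List<string>> GetPositionNamesAsync()
        {
            // Only the Name column is selected; the query runs when the task is awaited.
            return _context.Positions
                .Select(p => p.Name)
                .ToListAsync();
        }
    }
}
```

A class like this would be registered in ConfigureServices, for example with services.AddScoped<IPositionLogic, PositionLogic>(), and a controller would then take IPositionLogic in its constructor.  In a unit test the interface can be replaced with a fake, which is the testability benefit mentioned above.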

Let's start by creating a basic API controller and call it PositionsController.  If a Controllers folder doesn't exist in your TeamSample.Api project, go ahead and add it.  Right click the Controllers folder in the TeamSample.Api project and choose Add, but rather than "Controller", we're going to choose New Item and add a class.  Call it PositionsController.cs and save.

While not absolutely necessary, you can modify this class to inherit from the base Controller class.  This will give you some nice helpers that you can use, such as getting access to the Response.  While we're not going to be doing any exception handling to keep the example simple, we will use this to modify the status code of the returned response to signal different request states.

You'll also add a constructor to your controller which takes the TeamSampleDbContext as a parameter and initializes a private member with it.  Finally, you'll add a Route attribute to your controller class to specify the URL route that points to the API.  We'll use "/api/[controller]" as the route pattern.  This uses the special [controller] token, which is replaced with the name of the controller (minus the "Controller" suffix).  Here is the code for our new API class:

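using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;                      // HttpStatusCode, used by the action methods added later
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.EntityFrameworkCore;   // ToListAsync, FirstOrDefaultAsync
using TeamSample.Data;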

namespace TeamSample.Api.Controllers
{
    [Route("/api/[controller]")]
    public class PositionsController : Controller
    {
        private readonly TeamSampleDbContext _context;

        public PositionsController(TeamSampleDbContext context)
        {
            _context = context;
        }
    }
}

This puts us in a good spot and we can now start adding action methods to return the bits of data we need.  There is one more thing we need to do before we start returning data to our users.  We want to avoid directly returning our database entities via the API, so we're going to create some models to act as our results. 

There are several reasons for this.  We may only be interested in partial data and it doesn't make sense to return payloads through the API with data we don't need.  We also may want subsets of data to prevent potentially sensitive information from being returned in the event that someone updates the data model in the database.  Someone may also decide they want lazy loading enabled, and serialization of said entities could then cause the dreaded N+1 issue where multiple queries end up inadvertently executed against the database, or worse yet, returning the entire database.

We want to keep our API pretty lean.  In our case we're only interested in returning the Id, Name and Abbreviation associated with each position.  Let's go ahead and create the model to represent this data.  Add a Models folder to the TeamSample.Api project.  Note that if this were intended to be a public facing API, it may be desirable to create a separate project to hold your models.  This would allow you to distribute versioned assemblies to ease the consumption of the API.  In this case, we're just going to put the models in our API project because no one aside from us will be using them.  In the Models folder, create a new class and call it Position.cs.

namespace TeamSample.Api.Models
{
    public class Position
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Abbreviation { get; set; }
    }
}

Now we can go ahead and add an action method to return all of the positions in the system.  Add the following action method to the API.  We're not passing a route template to the HttpGet attribute which will make this method resolve to the /api/positions route.

[HttpGet]
public async Task<IEnumerable<Models.Position>> Get()
{
    // The query that fills in this method is added in the next section.
}

Now we've got our first API endpoint set up and ready to go.  The method is marked as async and returns a Task<IEnumerable<Models.Position>>.  This allows us to easily consume async code and returns the request thread to the host's thread pool while we await IO operations against the database.  Next, let's look at what it will take to query the database.  We're also going to discuss and implement projections.

EF Async - Getting Our Data

By using the async methods of EF Core, we can use async/await to get our data.  There are a few methods we should use, like ToListAsync() and FirstOrDefaultAsync().  There are also some we should avoid.  The avoided methods are generally methods used when adding and updating data.  For instance, EF provides us with an AddAsync() method.  If you look closely at the IntelliSense documentation, you'll see that it is async only so that special value generators (such as the SQL Server sequence HiLo generator) can access the database asynchronously; for all other cases the non-async Add method should be used.  To support adding and updating, we'll look instead at async transactions and the SaveChangesAsync() method of the DbContext.

We've stubbed out a Get method for Positions.  This is a really simple operation with no criteria, so we're going to just pull the list of Positions from the Positions set on our DbContext.  We're also going to use a projection; that is, we're going to use our select to return the data into the model we're returning instead of directly into an entity.  Add the following code to the Get method we created:

var positions = await _context.Positions
    .Select(p => new Models.Position
    {
        Id = p.Id,
        Name = p.Name,
        Abbreviation = p.Abbreviation
    })
    .ToListAsync();

return positions;

As you can see, we've used the async methods available to access the data.  We've awaited our result, projected the result into our return model through a lambda in the Select method, and called ToListAsync to execute the query asynchronously.

I want to talk a bit more about projections and why they're important.  Let's create a scenario.  Say we have the abbreviation of a particular position, and from that we want the full name.  First let's look at an incorrect example of how this could be done.

var position = await _context.Positions
    .FirstOrDefaultAsync(p => p.Abbreviation == abbreviation);

if (position == null)
{
    Response.StatusCode = (int) HttpStatusCode.NoContent;
    return null;
}

return position.Name;

In the snippet above, we query positions to find the position that matches the supplied abbreviation.  It seems to make sense.  If position is null, we set the response status code to NoContent, then we return null, otherwise we return position.Name.  The problem is, whenever we do this we're returning the entire Position object from the database.  Every column of data gets returned.  In larger tables and/or larger sets of results, this means much more traffic between the web server and the database engine.

By using projections, we can specify exactly which pieces of data we want to return.  The entity itself just becomes a contract of what data is available but we don't have to return it.  In our first example, we returned all properties of the Position object (except the navigation property).  In this case, we only care about the name.  By specifying which properties we want to project, we instruct Entity Framework to only query those values from the database.  Here's how we can clean up that same method:

[HttpGet("[action]/{abbreviation}")]
public async Task<string> NameFromAbbreviation(string abbreviation)
{
    var position = await _context.Positions
        .Where(p => p.Abbreviation == abbreviation)
        .Select(p => p.Name)
        .FirstOrDefaultAsync();

    if (string.IsNullOrWhiteSpace(position))
    {
        Response.StatusCode = (int) HttpStatusCode.NoContent;
        return null;
    }

    return position;
}

As you can see, we've made only minor changes.  We've moved the match criteria to a Where method call, used a Select to specify that we only want the Name column, and omitted the lambda from FirstOrDefaultAsync.  If you're not using projections with your existing queries, you should be.  This will help your data access stay leaner and more efficient.

As a bonus, along this same thread we can also handle how our API returns data to the client.  This allows us to use a smaller set of models to return a broader range of JSON object shapes.  ASP.NET Core MVC uses Newtonsoft's Json.NET to serialize and deserialize responses and requests.  We can add a configuration hook to instruct Json.NET on how to handle default values.  In fact, we can configure the serializer to not even create JSON properties for default values.

It's important to note that if you take this approach globally, you may want to inform consumers of this behavior.  For some objects, it may be perfectly acceptable for a string value to be null, so the client should be told not to expect the property to exist if the value is null.

Let's take a look at how we can change this globally.  In the ConfigureServices method of our Startup.cs, the last line reads:  services.AddMvc();.  If you examine that method, you'll see that it has a return type of IMvcBuilder.  This is an object that allows us to further configure our MVC pipeline.  We're going to use a method called AddJsonOptions, which accepts an action and allows us to configure the serialization settings.  Go ahead and replace services.AddMvc() with the following, which will configure JSON serialization options to ignore default values when serializing, but populate them when deserializing (omitting them from the response but honoring them for requests).  The DefaultValueHandling enum lives in the Newtonsoft.Json namespace, so add a using for it at the top of Startup.cs:

var mvcBuilder = services.AddMvc();

mvcBuilder.AddJsonOptions(o =>
{
    o.SerializerSettings.DefaultValueHandling = DefaultValueHandling.IgnoreAndPopulate;
});
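
To see what this setting does in isolation, here is a small console sketch that calls Json.NET directly with the same option.  The Position class here just mirrors our API model, and the Program class is only for illustration.  Note that this calls JsonConvert on its own, so property names come out exactly as declared; responses from MVC use camelCase names by default.  Also note that the setting applies to every default value, so an int of 0 is omitted just like a null string.

```csharp
using System;
using Newtonsoft.Json;

// Mirrors the shape of the API model for this standalone illustration.
public class Position
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Abbreviation { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var settings = new JsonSerializerSettings
        {
            DefaultValueHandling = DefaultValueHandling.IgnoreAndPopulate
        };

        // Abbreviation is left at its default (null), so it is not written.
        var withoutAbbreviation = new Position { Id = 7, Name = "Quarterback" };
        Console.WriteLine(JsonConvert.SerializeObject(withoutAbbreviation, settings));
        // prints {"Id":7,"Name":"Quarterback"}

        // Id is 0, the default for int, so it is skipped as well.
        var zeroId = new Position { Id = 0, Name = "Kicker", Abbreviation = "K" };
        Console.WriteLine(JsonConvert.SerializeObject(zeroId, settings));
        // prints {"Name":"Kicker","Abbreviation":"K"}
    }
}
```

The second case is the one to keep in mind when applying this globally: a legitimate value of 0 or false disappears from the response, so clients need to treat a missing property as the default value.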

Now let's add a new API method that allows us to retrieve a position by abbreviation.  In this case, we're going to use the same position class, but the client obviously already knows the abbreviation, so we're going to simply return the Id and Name properties.  Leaving the Abbreviation property at its default value of null will cause the serializer to skip serialization of that property entirely:

[HttpGet("[action]/{abbreviation}")]
public async Task<Models.Position> FromAbbreviation(string abbreviation)
{
    var position = await _context.Positions
        .Where(p => p.Abbreviation == abbreviation)
        .Select(p => new Models.Position {Id = p.Id, Name = p.Name})
        .FirstOrDefaultAsync();

    if (position == null)
    {
        Response.StatusCode = (int)HttpStatusCode.NoContent;
        return null;
    }

    return position;
}

As you can see, we've used a projection to return only the Name and Id properties.  The serializer will take care of the rest!  Note also that it is possible to apply these settings at the controller and/or action level as well, and I would recommend this approach when implementing nonstandard behavior.  Doing so is well beyond the scope of a post with a focus on Entity Framework, but if you wish to do so, take a look at the IResourceFilter interface and the ASP.NET Core documentation on filters.

Async Saving and Transactions

As mentioned earlier, while there are methods like AddAsync, we shouldn't normally use them.  They exist to support special value generators inside Entity Framework, and the task of adding an entity to the context doesn't otherwise perform any IO operations.  Those won't occur until we actually save changes on the context.

In the interest of brevity, we're going to omit adding additional API methods and just look at data modifications.  Let's look at a simple example of adding a team to the database.

var team = new Team
{
    Name = "Nebraska Cornhuskers"
};

_context.Teams.Add(team);
await _context.SaveChangesAsync();

That was easy enough.  The only real difference between this and any other regular Entity Framework usage is that we've used the SaveChangesAsync method and awaited it rather than relying on the synchronous and blocking SaveChanges method.  Now let's say you want to do something a little more complex.  Let's update our example to add the team and then separately add a mascot for that team.

var team = new Team
{
    Name = "Nebraska Cornhuskers"
};

var mascot = new Mascot()
{
    Name = "Herbie Husker"
};

_context.Teams.Add(team);
await _context.SaveChangesAsync();

mascot.TeamId = team.Id;

_context.Mascots.Add(mascot);
await _context.SaveChangesAsync();

Again, everything seems good here, but what happens if adding the mascot fails?  The team will be added in the database but will not have a mascot associated with it.  If our business rules say the team should not be added without the mascot, then we'll want to use a transaction to make sure that either everything succeeds or nothing does.  This allows us to catch an exception, roll back the transaction and do whatever additional logging we may (read: should) want to do.  Here's the updated code.

using (var txn = await _context.Database.BeginTransactionAsync())
{
    try
    {
        var team = new Team
        {
            Name = "Nebraska Cornhuskers"
        };

        var mascot = new Mascot()
        {
            Name = "Herbie Husker"
        };

        _context.Teams.Add(team);
        await _context.SaveChangesAsync();

        mascot.TeamId = team.Id;

        _context.Mascots.Add(mascot);
        await _context.SaveChangesAsync();

        txn.Commit();
    }
    catch 
    {
        txn.Rollback();
        throw;
    }
}

There's one more point of interest I want to note here.  While acquiring the transaction is an async operation (BeginTransactionAsync), EF Core 2 offers no async equivalents for Commit or Rollback.  While this may not seem intuitive, they are very lightweight calls that simply send a command to the database, so async variants would offer little real benefit.

DbContext and Concurrency 

Before I leave you, I want to talk about concurrency.  While the async methods exist and make working with EF Core in an async manner much easier, the DbContext is not thread safe and doesn't support running multiple operations on the same instance at once, so await each async call before starting the next.  If you need to work with Entity Framework from multiple threads, or run queries in parallel, you'll have to use multiple instances of the DbContext.
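
As a rough illustration of the "multiple instances" approach, the sketch below runs two count queries in parallel by giving each one its own DI scope, and therefore its own DbContext.  The ParallelCounts class and its method names are hypothetical; the sketch assumes TeamSampleDbContext is registered with the default scoped lifetime and exposes the Teams and Positions sets used earlier.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using TeamSample.Data;

// Hypothetical helper; it is not part of the original sample.
public class ParallelCounts
{
    private readonly IServiceScopeFactory _scopeFactory;

    public ParallelCounts(IServiceScopeFactory scopeFactory)
    {
        _scopeFactory = scopeFactory;
    }

    public async Task<(int Teams, int Positions)> CountInParallelAsync()
    {
        // Each call below creates its own scope and context, so the two
        // queries never touch the same DbContext instance.
        var teamsTask = CountWithOwnContextAsync(c => c.Teams.CountAsync());
        var positionsTask = CountWithOwnContextAsync(c => c.Positions.CountAsync());

        await Task.WhenAll(teamsTask, positionsTask);

        // Both tasks have completed, so reading Result does not block.
        return (teamsTask.Result, positionsTask.Result);
    }

    private async Task<int> CountWithOwnContextAsync(Func<TeamSampleDbContext, Task<int>> query)
    {
        using (var scope = _scopeFactory.CreateScope())
        {
            var context = scope.ServiceProvider.GetRequiredService<TeamSampleDbContext>();

            // Await inside the using block so the context is not disposed mid-query.
            return await query(context);
        }
    }
}
```

By contrast, starting both CountAsync calls on a single injected context and passing them to Task.WhenAll is the pattern to avoid.  When you only have one context, simply await the first query before starting the second.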

If you'd like to download the completed sample, it's available here:

TeamSampleFinal.zip (23.72 kb)

And that's it!  There's obviously much more that can be done, but hopefully this has given you a good running start and introduced you to some important concepts and implementations.  I'll leave this as the final part of this series, but in future posts we'll look at some other options with ASP.NET Core and Entity Framework Core, such as building a data layer that's mostly database agnostic and can be changed simply through configuration.  If you've found this information useful or have any questions, drop a comment below and follow me on Twitter @dustinhorne.