Entity Framework Design Meeting Notes
The Entity Framework team has a weekly design meeting in which we discuss/recap design and other issues with the codebase. These are the notes from those meetings. The intention is to provide a history of what decisions have been made and why. No attempt is made to go back and update notes from older meetings if we later change a decision and decide to do something different.
August 2, 2012
CUD Batching
There have been various CodePlex discussions around adding batching support to EF. These have resulted in some useful information and ideas which are being gathered together here with ideas from the team. This then creates a starting point for the EF team and/or the community in implementing this feature.
Currently, when executing SaveChanges, EF produces and executes a discrete DB operation for each CUD operation. However, SQL Server and some other backends provide a way to send a set of operations as a single unit, which can yield better performance, especially when the connection to the database has high latency.
This is the most-voted issue on CodePlex (14 votes) and a close second on UserVoice (1,492 votes).
To support batching we would need to solve two problems:
- Enable batching in the SQL Server provider
- Change the provider model to enable different batching implementations
SQL Server
There are several approaches we could consider:
- Append all operations in a single command text (see the sketch after this list).
- If server-generated values are present we could either:
- Preserve the order of the operations and output the values in different result sets
- Alter the operations to output the generated values to a temporary table and then query those values in a single statement
- This option can “break” plan caching
- Not clear that this will be a real problem in the wild
- Benefits of batching will likely outweigh it, especially in high-latency scenarios
- Is it possible to disable plan caching? Is it worth it?
- Might be a reason for allowing batching to be switched off
- For inserts it is possible to use a single statement (INSERT INTO … OUTPUT INSERTED … VALUES)
- We need to do some prototyping to get perf measurements to see how much of an advantage this would provide
- Use SqlDataAdapter
- It seems the only public way to do this is with DataSets, which is a dependency we should not take and which may not provide real perf improvements anyway
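A rough sketch of the first approach, using plain ADO.NET rather than provider code: the statements are concatenated into one command text, and each OUTPUT clause returns its result set in operation order. The table and column names are invented for the example.

using System.Data.SqlClient;

internal static class BatchingSketch
{
    public static void SaveBatch(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = connection.CreateCommand())
        {
            // Two inserts appended into one command text; each OUTPUT clause
            // returns the server-generated key in its own result set,
            // preserving the order of the operations.
            command.CommandText =
                "INSERT INTO [Customers] ([Name]) OUTPUT INSERTED.[Id] VALUES (@p0); " +
                "INSERT INTO [Orders] ([Total]) OUTPUT INSERTED.[Id] VALUES (@p1);";
            command.Parameters.AddWithValue("@p0", "Alice");
            command.Parameters.AddWithValue("@p1", 42.50m);

            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                do
                {
                    while (reader.Read())
                    {
                        // Propagate the generated key back to the tracked entity here.
                        var generatedId = reader.GetInt32(0);
                    }
                }
                while (reader.NextResult()); // advance to the next operation's result set
            }
        }
    }
}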
Provider Model
Other backends might be restricted to one of the above options or have a different way of achieving this. We need a flexible model to accommodate them.
If any of the operations rely on a server-generated value from a different operation, they usually can't be batched together. We can deal with this in one of the following ways:
- Splitting the batches in the update pipeline before sending to the provider
- This is basically not dealing with the problem, but could be a first step
- Add the notion of temporary keys to command trees
- This requires the provider to understand temp keys and key propagation, which is non trivial
- Add client key generation strategies: GUID, sequences, hi-lo, etc.
- When using key generation strategies the key propagation is handled by the state manager before sending anything to the provider which means that we would not need to split the batches
Different providers may have different limits on what they can batch. This means that the provider must be able to split the batch independently. Some options:
- Send one update at a time to the provider.
- The provider may choose to hold onto the update or send it to the database batched with previous updates.
- If it sends the updates, it will return information back to us as multiple result sets.
- We will also tell the provider when we have no more updates to send so that the provider can finish the last batch.
- Send one update at a time to the provider with the provider using an event to give information back when it decides to send the batch to the server.
- Send all updates to the provider at once.
- This could be a push from EF to the provider or allow the provider to pull updates from EF
- The provider returns one or more data readers with multiple result sets back to EF
Open questions
- Where should MaxBatchSize be exposed?
Possible classes for one-at-a-time approach (async version):
abstract class DbBatchCommand : IDisposable
{
    DbTransaction Transaction
    DbConnection Connection
    int Timeout

    DbBatchResult AddAndExecute(DbCommandTree)
    Task<DbBatchResult> AddAndExecuteAsync(DbCommandTree)
    DbBatchResult Execute()
    Task<DbBatchResult> ExecuteAsync()
}

class DbBatchResult
{
    bool Executed
    bool HasReader
    int RowsAffected
    DbDataReader Reader
}

abstract class DbProviderServices
{
    DbBatchCommand StartBatch()
}
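To make the proposed shape concrete, here is a hypothetical driver loop showing how the update pipeline might use it; none of this is final API, and the FlushUpdates name is invented for the example.

// Hypothetical driver loop for the proposed one-at-a-time batching API.
DbBatchResult FlushUpdates(DbProviderServices services, IEnumerable<DbCommandTree> pendingTrees)
{
    using (DbBatchCommand batch = services.StartBatch())
    {
        foreach (var tree in pendingTrees)
        {
            // The provider may buffer the operation, or flush a full batch now.
            DbBatchResult result = batch.AddAndExecute(tree);
            if (result.Executed && result.HasReader)
            {
                using (var reader = result.Reader)
                {
                    // Read one result set per batched operation to propagate
                    // rows-affected counts and server-generated values.
                }
            }
        }

        // No more updates to send: the provider finishes the final batch.
        return batch.Execute();
    }
}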
Bulk operations
There have also been CodePlex discussions about bulk operations. The idea here is to improve the perf of updates and deletes by providing a way to perform many of them in the server without first having to bring the entities into the context. Some suggestions for how the code might look are:
var user = new UserInfoFields();
var update = user.Update().Set
    (
        user.Field1 = 1,
        user.Field2 = "xxxx"
    ).Where(user.Name == "Jim" && user.Enable);
update.Execute();

context.Employees
    .Where(e => e.Title == "Spectre")
    .Update(e => new Northwind.Employee { Title = "Commander" });
There are two important questions that need to be answered about a bulk operation implementation:
- Should calling the API cause the updates to happen immediately or should they happen when SaveChanges is called?
- Deferring until SaveChanges is called at first seems to match existing EF behavior. However, SaveChanges is currently only concerned with writing changes that have been detected in tracked entities. For bulk updates to happen as well the state manager would have to track which bulk operations are also pending, which is a significant change in both mental model and implementation.
- In addition, the APIs feel like they should send updates immediately, so deferring until SaveChanges could be unintuitive.
- Decision: Use immediate execution
- Should performing a bulk update affect local entities being tracked by the state manager?
- If local entities are not touched then it would be very easy to have the state manager get out-of-sync with the database resulting in unexpected behavior on subsequent uses of the context. In other words, it would be easy for people to shoot themselves in the foot.
- The problem is that it is not in general possible to know exactly what changes will be made to the database such that these changes can be reflected.
- This could be due to the database being out of sync with the context or because of semantic differences in how we interpret the query compared to how the database interprets it—for example, string compare differences.
- Decision: While it is not possible to be sure that the local changes will exactly match the database changes it seems that we may be able to get close enough to avoid most foot-shots. We should aim for this.
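For reference, the second suggestion above implies an extension-method shape roughly like the following; this is purely illustrative, not a designed API.

using System;
using System.Linq;
using System.Linq.Expressions;

public static class BulkUpdateExtensions
{
    // Illustrative only. Executes immediately (per the decision above) and
    // returns the number of rows affected.
    public static int Update<TEntity>(
        this IQueryable<TEntity> source,
        Expression<Func<TEntity, TEntity>> updateExpression)
        where TEntity : class
    {
        // A real implementation would translate 'source' and 'updateExpression'
        // into a single UPDATE ... WHERE statement, execute it immediately, and
        // then apply a best-effort version of the change to matching entities
        // already tracked by the state manager.
        throw new NotImplementedException();
    }
}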
July 26, 2012
Async
- Should we move all public async methods to a separate namespace as extension methods so they don’t pollute Intellisense?
- Decisions:
- We will leave async methods in System.Data.Entity primarily to reduce the number of namespaces that people will need to import.
- Async versions of methods that LINQ to Entities does not support will be removed.
- Should we remove the Create methods from IDbAsyncQueryProvider and derive it from IQueryProvider?
- Decision: Yes
- I changed Database.SqlQuery to return DbSqlQuery so that it could be enumerated asynchronously (without tracking) and DbSet.SqlQuery will now return DbSqlSetQuery that derives from DbSqlQuery.
- Is this the right hierarchy and naming?
- Should we provide singleton async methods on DbSqlQuery, since it doesn’t implement IQueryable?
- Decisions:
- We should not change the name of the existing class—unnecessary breaking change.
- We need to find a name for the new class—probably will be done as part of polishing later
- We should add instance async methods in the same way that we did for AsNoTracking
One-pagers
- Async feedback
- Current draft looks good; don’t need to list out all API
- General goals/format
- The one-pager should remain very high level and provide a short overview of the important points about the feature.
- The audience is the community who wants an overview of the current state of the feature.
- The format and sections are not rigidly defined and should reflect the important things the community would need to know. For example, it might include:
- Goals
- Non-goals
- Dependencies
- Design
- API Usage
- Challenges
- Limitations
- The document should be periodically updated as development progresses.
- It will form the basis of a blog post when the feature is implemented to the level at which we want to elicit wider feedback from the community.
- Note that this is not an up-front design document or a comprehensive spec. It’s just a summary of where we are at with the design.
Initializer proposal
A potential contributor has started a discussion about adding a new initializer to our lineup. The initializer would be used when you don't ever want EF to create a database (even if it doesn't exist) but do want EF to check that the database exists and that the model matches, throwing if not. This allows a fail-fast in all cases where the database is not up-to-date. It should probably also fail if model information cannot be found in the database.
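A minimal sketch of what such an initializer might look like, built from the existing Database.Exists and Database.CompatibleWithModel APIs; the class name is invented.

using System;
using System.Data.Entity;

public class ValidateDatabaseInitializer<TContext> : IDatabaseInitializer<TContext>
    where TContext : DbContext
{
    public void InitializeDatabase(TContext context)
    {
        if (!context.Database.Exists())
        {
            throw new InvalidOperationException(
                "The database does not exist, and this initializer will never create one.");
        }

        // throwIfNoMetadata: true also makes this fail when no model
        // information can be found in the database.
        if (!context.Database.CompatibleWithModel(throwIfNoMetadata: true))
        {
            throw new InvalidOperationException(
                "The database does not match the current Code First model.");
        }
    }
}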
Two questions:
- This seems like a useful initializer but should it go into EF.dll or is it better off in contrib or elsewhere?
- Decision: The idea has merit for the core; i.e. we are interested.
- Should it also fail if model information cannot be found?
- Decision: Yes; it's primarily a fail-fast check for people using Migrations.
- Related: The IoC work requires a NullDatabaseInitializer for disabling database initialization (since null means something else). Should this be public?
- Decision: Yes, it can be made public.
July 12, 2012
Model key caching
When discovering a Code First model, DbContext has to decide whether to use a cached model or run OnModelCreating and the rest of the pipeline to create one. In EF5 the key used for this cache lookup is a tuple of the derived context type and the provider invariant name. However, sometimes the same context type and provider need to be used with multiple models, such as when using different schemas for multi-tenant databases. Allowing the cache key to be injected makes this possible without the need for custom model building and caching.
The implementation of this makes use of the dependency injection work as follows:
- Key abstractions identified:
- IDbModelCacheKey
- (Equals, GetHashCode)
- IDbModelCacheKeyFactory
- Create(DbContext)
- Create default implementations
- DefaultModelCacheKey etc.
- “Invert” control
- Prefer .ctor injection
- Can also use “poor man’s DI” to aid testing:
public LazyInternalContext(IDbModelCacheKeyFactory cacheKeyFactory = null)
{
    _cacheKeyFactory = cacheKeyFactory ?? new DefaultModelCacheKeyFactory();
}
- Go directly to the resolver if injection impractical
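To illustrate the multi-tenant scenario, here is a sketch of a cache-key factory that includes the tenant's schema in the key so each schema gets its own cached model. It codes against the abstractions identified above; the MultiTenantContext type and its Schema property are assumptions for the example.

using System;

public class SchemaModelCacheKeyFactory : IDbModelCacheKeyFactory
{
    public IDbModelCacheKey Create(DbContext context)
    {
        // MultiTenantContext is an assumed context type that knows its schema.
        var tenantContext = context as MultiTenantContext;
        var schema = tenantContext != null ? tenantContext.Schema : "dbo";

        return new SchemaModelCacheKey(context.GetType(), schema);
    }
}

public class SchemaModelCacheKey : IDbModelCacheKey
{
    private readonly Type _contextType;
    private readonly string _schema;

    public SchemaModelCacheKey(Type contextType, string schema)
    {
        _contextType = contextType;
        _schema = schema;
    }

    public override bool Equals(object other)
    {
        var key = other as SchemaModelCacheKey;
        return key != null && key._contextType == _contextType && key._schema == _schema;
    }

    public override int GetHashCode()
    {
        return _contextType.GetHashCode() * 31 + _schema.GetHashCode();
    }
}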
Notes:
- Not all dependencies that can be injected need to be exposed explicitly in DbConfiguration. In this case a property was added but we have now decided to remove it. (Work item 373.)
- The schema is not included in the cache key by default because we would need to run OnModelCreating to get it, and this has perf implications for large models.
- We could add an API to make the schema available without running OnModelCreating, but on balance it seems we can instead make people aware of how to do this using the mechanism above without adding new surface
- We should blog on how to do this.
- We need to document which interfaces/base classes the EF will try to resolve (Work item 374.)
Migrations history table schema changes
Background:
- DbModelBuilder now has a HasDefaultSchema() method to allow the schema that is used in the created model to be changed.
- Ideally, this should also affect __MigrationHistory table so that
- it is co-located with the other tables in the schema
- and so that multiple history tables can exist in the same database
- We then need to be able to migrate the history table along with the rest of the model
- We probably want to allow further history model configuration in EF6 (table name etc.)
Current implementation:
- Propagate default schema to history context
- Include history model metadata in user metadata (transparently)
- Create/Drop history becomes part of the standard pipeline.
- Introduce “IsSystem” annotation so we can identify our metadata.
- Can’t really rely on names
Issues:
- Handling default schema changes is tricky:
- “foo” -> “bar”, (Add|Get)-Migration fails because history is in “foo” but we look in “bar”
- For explicit migrations we could successively try schemas from the code-behind metadata.
- But auto-migrations don’t really work because the current DB metadata is only in the history table!
- Use Info Schema?
- Could still find multiple history tables, which one do we use?
- By design we don’t reflect into the DB from Migrations but we could change that.
- Existing apps' metadata doesn't contain history metadata.
- Use IsSystem in the differ to avoid producing false diffs.
- When updating, inject history metadata into first user migration (in memory) if not present
- SQL Server system objects cannot be moved
- Do a table rebuild (SELECT * INTO foo.Bar FROM dbo.Bar)
Notes:
- Adding the history table metadata to the user metadata is fine, but we should try to keep this a valid EDMX document to ensure we (or others) can easily parse and understand it in the future. (Work item 375.)
- Is IsSystem enough or do we need to go with something more specific, such as IsHistory?
- IsSystem is okay as long as it works. Depending on what we do with the system-ness of the history table we could change it.
- We will likely need to use IsSystem in conjunction with c-space names to ensure we can always find the correct metadata.
- Putting history info in a different container was considered but would require considerable work in other areas, such as the model differ.
- How do we find the history table after the schema changes?
- For explicit migrations we will look in the code-behind.
- For automatic migrations not only can we not find it but we can’t distinguish the case of not finding it from the case of it not being there because the database is new. This could cause Migrations to re-create all the tables in a new schema while all the existing tables still exist (with data) in the old schema.
- Options:
- Considered: Make it so that automatic migrations can only be used with the default schema
- Considered: Provide some kind of API that forces users to declare a schema change explicitly
- Decision: Make it so that changing the schema doesn’t automatically move the history table as well; you would have to do this with a manual step
- Should the history table continue to be a system table?
- Majority in the room believed that it should be
- Majority believed that we should not add special code to allow this to be changed in EF6 (Current ways of changing it are okay.)
- Code must continue to work with either a system table or a normal table
- Migrating the history table to a new schema must account for the table-rebuild this requires
Async immediate LINQ operators
How should async operators like FirstAsync and CountAsync (that return single values instead of queries) work if the IQueryable is not our IQueryable?
- If we throw it makes it a bit harder to mock
- If we don’t throw it could look like these things work with other LINQ providers when they don’t
- Decision: Throw and make sure that mocking can still be done. Document as necessary.
July 5, 2012
Refactoring and unit tests
We have been doing quite a lot of refactoring in the core code recently. This is generally good, but is risky due to the lack of test coverage for the core. To mitigate this risk we should:
- Provide a good description of the refactoring as part of the code review
- The description should include some information on how the refactoring was tested
- The description should be included in the commit comment so that it can be understood later
- More generally, we need to make sure we provide good descriptions for all code reviews and that these descriptions get committed
- This is essentially the same process we had in the old team
- Make sure that any core code you touch is tested
- If possible, write tests before refactoring
- If the refactoring is to permit testing then make sure the code is well tested after the refactoring
June 28, 2012
Code contracts
Background
Currently we are using code contracts just for runtime verification of preconditions and don’t build the reference assemblies.
Any users that derive their types from the types defined in EF and want to use code contracts won't be able to access the contracts that we defined.
Options considered
- Ship the reference assembly in the same NuGet package as the main assembly
- This would unnecessarily bloat the package for users who don't use code contracts. The reference assembly is about half the size of the main one.
- It is possible to use explicit assembly references in the package so that the reference assembly wouldn't be added as a project reference, since it's not used at runtime.
- Currently we don’t specify postconditions, so the reference assembly would be of limited value even to code contracts users.
- Ship the reference assembly in a separate NuGet package
- This wouldn’t bloat the main package, but would be less discoverable and still have all the other cons listed above
- Don’t ship the reference assembly, but build it
- Users with access to the source code will be able to build it when needed.
Decision
We won’t ship the reference assembly unless we get some requests to do otherwise. However we need to improve the contracts at least on the public surface and consider enabling static verification.
June 21, 2012
Code-based configuration
Background
In EF 4.1 most configuration was done through code. In EF 4.3 we added the EntityFramework configuration section to allow configuration via config file.
Both code-based configuration and file-based configuration are useful. Code-based configuration can make use of IDE and compiler services (strong typing, Intellisense, etc.) and is flexible especially when coupled with dependency injection. File-based configuration can allow the same code to run in different environments without re-compiling.
The main problem with code-based configuration is making sure that the configuration is available to design-time tooling that does not run the application. The tooling must be able to find and execute (or otherwise interpret) the code. This is not possible if the configuration is performed by some arbitrary call made at some point during app startup.
Goals
- If I don’t know about or care about using code-based configuration, then everything still works
- In particular, EF 4.3 config file configuration is not changed
- If I do want to use code-based configuration then it should be simple to do for the main developer scenarios
- If I use it in this simple way, then tooling will also be able to find and use my code-based configuration
- Whatever we add now should also form the basis for adding new configuration going forward
- Less common scenarios (no context, multiple contexts, using Code First building blocks, etc.) should still be easy with code-based configuration
- These scenarios may require additional steps but it should be easy to find out what the steps are
Basic Idea
We provide a DbConfiguration base class. To use code-based configuration a developer creates a class derived from DbConfiguration and places it in the same assembly as their context. Configuration settings are made in the constructor of this class.
public class MyConfiguration : DbConfiguration
{
    public MyConfiguration()
    {
        SetDefaultConnectionFactory(new LocalDbConnectionFactory("v11.0"));
        AddEntityFrameworkProvider("My.New.Provider", new MyProviderServices());
    }
}
Additional details
- Can I specify the configuration type to use in the config file?
- Yes. This overrides discovery and allows your DbConfiguration class to be contained in any assembly.
- Design-time still works in this case because the config can (and should) be made available to the tooling
- What if I run some code that needs to use configuration before I use my context type?
- This just works if you’re not using code-based configuration or if you have specified your DbConfiguration class in the config file.
- If you are using code-based configuration then you must set the DbConfiguration to use explicitly:
DbConfiguration.Instance = new MyConfiguration();
- If you don’t do this then we will throw when you use your context and we discover that you have a DbConfiguration class but didn’t set it.
- We will also throw if you set the DbConfiguration to something we can’t discover.
- What if I want to use EF without a derived context type at all?
- This is the same as the previous bullet point except that we will never actually do the discovery
- There is an assumption here that tooling will always make use of a derived context.
- What if I have multiple contexts in multiple assemblies?
- The easiest option is to specify the DbConfiguration class in the config file.
- We may also choose to allow a special type of DbConfiguration class that just acts as a proxy to a class in another assembly.
- What if I have a context that shouldn’t impact application configuration?
- A good example of this is the HistoryContext used by Migrations. Using this context shouldn’t affect DbConfiguration resolution—it should just use whatever configuration the application is using.
- If this context is in the same assembly as the DbConfiguration you want to use then it’s not a problem.
- If it’s in a different assembly then you can put a type derived from DbNullConfiguration in this assembly. This tells EF to ignore DbConfiguration discovery for contexts in this assembly. In particular, it tells EF not to throw if a DbConfiguration is discovered in both this assembly and another assembly. (A sketch follows this list.)
- How does this work with dependency injection?
- DbConfiguration is actually the place where the IDbDependencyResolver chain is rooted.
- All configuration settings are resolved using the resolver chain.
- When a configuration value is set this is implemented by adding a new resolver to the chain.
- You can also add your own resolvers directly when constructing the configuration
- Can I mutate the code-based configuration after it has been set?
- Not directly, because that encourages code that behaves differently when run in the application than when run at design-time.
- The setter methods are protected to encourage usage from the constructor.
- Once the configuration is set (either implicitly or explicitly) then it is locked and further attempts to modify will throw with info on the correct way to use DbConfiguration.
- However, you can add a dependency resolver that can have behavior that changes as the application runs.
- This doesn’t pose as much of a risk for design-time since the resolver is still added and must function in some way at design-time.
- I’m currently using this in the functional tests to change the DefaultConnectionFactory to target SQL CE or LocalDb for some tests.
- What if I set some config in both code and using the config file?
- Config always wins.
- The code below shows a CompositeResolver that ensures dependencies are always resolved from the config before other resolvers are tried.
- What happens to the existing code-based configuration APIs?
- We will obsolete (or remove) SetDefaultConnectionFactory
- We may obsolete SetInitializer, but this will impact a lot of people.
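To illustrate the DbNullConfiguration opt-out mentioned above, the marker is just an empty derived class placed in the relevant assembly (the class name is arbitrary):

// Contexts in this assembly (e.g. a HistoryContext-style helper context)
// will not trigger or conflict with DbConfiguration discovery.
public class MigrationsHelperConfiguration : DbNullConfiguration
{
}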
Comments/suggestions from the meeting
- Consider having NuGet package generate a DbConfiguration class instead of creating/updating config
- This would avoid the potential confusion of some config we create overriding config the user then sets
- But it is harder to parse/update DbConfiguration code—for example, to switch connection factory
- Also not clear how other packages would update/add to this code
- Consider allowing packages to create code snippets that are collected together
- No clear idea on how this would work
- We should update existing APIs to allow their dependencies to be injected
- We could in the future then try to remove the non-context based use of the configuration, but this would be a lot of changes to existing APIs
- Alternately we could throw if non-context use happens and the config is not specified in the config file
- Either way, we should understand which APIs currently use configuration
- We will implement this as is for now, then iterate on it
Current code for DbConfiguration
public class DbConfiguration
{
    private readonly CompositeResolver<ResolverChain, ResolverChain> _resolvers
        = new CompositeResolver<ResolverChain, ResolverChain>(new ResolverChain(), new ResolverChain());
    private bool _isLocked;

    protected internal DbConfiguration()
        : this(new AppConfigDependencyResolver(AppConfig.DefaultInstance), new RootDependencyResolver())
    {
    }

    internal DbConfiguration(IDbDependencyResolver appConfigResolver, IDbDependencyResolver rootResolver)
    {
        _resolvers.First.Add(appConfigResolver);
        _resolvers.Second.Add(rootResolver);
    }

    public static DbConfiguration Instance
    {
        get { return DbConfigurationManager.Instance.GetConfiguration(); }
        set
        {
            Contract.Requires(value != null);
            DbConfigurationManager.Instance.SetConfiguration(value);
        }
    }

    internal void Lock()
    {
        _isLocked = true;
    }

    internal void AddAppConfigResolver(IDbDependencyResolver resolver)
    {
        Contract.Requires(resolver != null);
        CheckNotLocked();
        _resolvers.First.Add(resolver);
    }

    protected void AddDependencyResolver(IDbDependencyResolver resolver)
    {
        Contract.Requires(resolver != null);
        CheckNotLocked();

        // New resolvers always run after the config resolvers so that config always wins over code
        _resolvers.Second.Add(resolver);
    }

    [CLSCompliant(false)]
    protected void AddEntityFrameworkProvider(string providerInvariantName, DbProviderServices provider)
    {
        CheckNotLocked();
        AddDependencyResolver(new SingletonDependencyResolver<DbProviderServices>(provider, providerInvariantName));
    }

    [CLSCompliant(false)]
    public DbProviderServices GetEntityFrameworkProvider(string providerInvariantName)
    {
        // TODO: use generic version of Get
        return (DbProviderServices)_resolvers.Get(typeof(DbProviderServices), providerInvariantName);
    }

    protected void SetDatabaseInitializer<TContext>(IDatabaseInitializer<TContext> strategy) where TContext : DbContext
    {
        CheckNotLocked();
        AddDependencyResolver(new SingletonDependencyResolver<IDatabaseInitializer<TContext>>(strategy));
    }

    public IDatabaseInitializer<TContext> GetDatabaseInitializer<TContext>() where TContext : DbContext
    {
        // TODO: Make sure that access to the database initializer now uses this method
        return (IDatabaseInitializer<TContext>)_resolvers.Get(typeof(IDatabaseInitializer<TContext>), null);
    }

    public void SetDefaultConnectionFactory(IDbConnectionFactory value)
    {
        CheckNotLocked();
        AddDependencyResolver(new SingletonDependencyResolver<IDbConnectionFactory>(value));
    }

    public IDbConnectionFactory GetDefaultConnectionFactory()
    {
        return Database.DefaultConnectionFactoryChanged
#pragma warning disable 612,618
                   ? Database.DefaultConnectionFactory
#pragma warning restore 612,618
                   : (IDbConnectionFactory)_resolvers.Get(typeof(IDbConnectionFactory), null);
    }

    private void CheckNotLocked()
    {
        if (_isLocked)
        {
            throw new InvalidOperationException(
                "Configuration can only be changed before the configuration is used. Try setting configuration in the constructor of your DbConfiguration class.");
        }
    }
}
June 14, 2012
Making database connections in the test suite more configurable
Currently the vast majority of the EF6 tests run only against SQL Express. Some others are set up to also run against SQL Compact or LocalDb. To check that things work as expected against other backends, it would be good if we could configure most tests to run against any type of database. The initial requirements for this are:
- Check-in tests and main CI build should still run against SQL Express to provide fast check-in bar and CI feedback.
- The run against other backends must happen frequently (at least once a day) but could be either triggered or scheduled.
- We will still need some tests directed towards a specific backend (e.g. SQL Compact) where those tests are specifically designed to test behavior that is different for different database types.
- In other cases the tests may be the same, but with backend-specific assertions such as is currently implemented for the Migrations tests.
- The default for any new tests should be to run against any backend.
- Attributes should be used to exclude a test from certain backends (a sketch follows this list).
- It should also be possible for assertions to be backend specific, as stated above.
- The correct provider manifest token should be used for the backend in question. For example, it would be possible to use the 2005 manifest token when testing against SQL Server 2008, but this should not be done.
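A sketch of what the exclusion attribute might look like; the name and its integration with the xUnit test runner are assumptions, not existing infrastructure.

using System;

// Hypothetical marker the test infrastructure would read to skip a test
// for particular backends, identified by provider invariant name.
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public class ExcludeFromBackendAttribute : Attribute
{
    public ExcludeFromBackendAttribute(params string[] providerInvariantNames)
    {
        ProviderInvariantNames = providerInvariantNames;
    }

    public string[] ProviderInvariantNames { get; private set; }
}

A test would then be marked with, for example, [ExcludeFromBackend("System.Data.SqlServerCe.4.0")] and skipped when the run targets that backend.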
We will schedule this work item for EF6. Some ideas on implementation:
- As much as possible only functional tests should hit the database. Some unit tests currently hit the database—these should be investigated and changed/moved appropriately.
- The Migrations/XUnit infrastructure is a good place to start for this, but will need significant modification to meet the requirements.
June 7, 2012
Naming convention for static readonly and const fields
Decision:
- All constants are pascal cased (e.g. PascalHasMoreHumps)
- All fields are underscore camel cased (e.g. _humpLikeACamel)
- Any public fields will be made into constants or encapsulated
How will we handle breaking changes in EF6?
When considering whether or not to make a breaking change in EF6 or later we will consider the overall customer experience, both long term and short term. In particular if the short-term impact of the change is small and the long term benefit is great, then we will take the change. A “small” short-term impact usually means one or more of the following:
- The change will not affect people except in very corner cases
- The change causes an immediate and easy-to-fix build break such that the chance of it causing a production bug is small
- The change is really a bug fix and so fixing it will make more applications work correctly rather than break those that depend on the broken behavior
A breaking change that we are unlikely to take (without a flag to switch the new behavior on) would be one that breaks runtime behavior (rather than the build) in a subtle and/or difficult to fix way.
We are more able to take breaking changes in future versions of EF because we will not be releasing in-place updates and are making use of semantic versioning to signal significant breaking changes to consumers of our assemblies.
We will mark work items/bugs that result in breaking changes so that we can call out the breaking changes when we release.
High-level ideas for using dependency injection with EF and specifics for the provider
Specific, current problem:
- How can we get the EF provider from the config when the config has been overridden using DbContextInfo?
- Many places that need the provider are not coupled to DbContext or DbContextInfo
- Adding coupling to the context due to the dependency on the provider smells bad
More general problem:
- We need a way to resolve dependencies (such as the EF provider) such that setting how the dependency is resolved is decoupled from uses of the dependency
- In other words, we need an inversion-of-control (IOC) or dependency injection (DI) container
High level design:
- We don’t want to be strongly coupled to any one DI container
- Don’t want the binary dependency on the DI assembly
- Don’t want to limit people to using a specific DI container when they may already be using and/or prefer another
- We can follow the MVC model of having an IDependencyResolver interface into which other DI containers can be plugged
- Learn lessons from MVC—for example, provide a way to release dependencies using the Release method
- We need the ability to resolve by both CLR type and name—for example, the provider invariant name
- It might look something like:
public interface IDbDependencyResolver
{
    object Get(Type type, string name);
    void Release(object service);
}
- Open questions following the design meeting:
- Do we want additional overloads of Get that take just the type and/or just the name?
- How does taking the type and the name work when plugging in various DI containers?
- Note that we will provide generic extension methods to avoid the need to cast (a sketch follows this list).
- Provide an app-domain-wide registration point for the IDependencyResolver instance to use
- We will set a default resolver that is used if you don’t know/want/need to use your own container
- Will use a Chain of Responsibility pattern to allow dependency resolution to be overridden per dependency
- Open issue: we need to figure out how this affects design-time scenarios and DbContextInfo
- We should look at using attributed methods similar to those that ASP.NET uses
- We could use the equivalent of a configuration class like we have for migrations
- Look at using ServiceLocator or the equivalent
- The API might be something like:
public static class DbDependencyResolver
{
    public static IDbDependencyResolver Root { get { ... } }

    public static void Add(IDbDependencyResolver resolver) { ... }
}
- We decided not to provide a setter for the Root. There is no real need to change the root as opposed to adding a new resolver to the chain and we can simplify the code that uses the chain if we always know that the last one in the chain will be our resolver—for example, we can make the assertion that some dependencies will always return a value and will not be null.
- Possibly we don’t need to expose Root but rather just expose methods—for now we will expose Root
- Based on the decision for DbContextInfo we will probably need a Remove method to remove a resolver from the chain.
- Internally, we will change places that have hard-coded dependencies to allow their dependencies to be injected
- Hard-coded dependencies may be uses of new or accesses to a singleton
- For public surface that implicitly uses a hard-coded dependency we will use the app-domain wide resolver
- We will probably also provide public surface for the injected dependency
- Depending on the scope of dependencies that part of the code needs we may choose to inject an IDbDependencyResolver or the contract for the specific dependency
- Using IDbDependencyResolver allows multiple independent dependencies to be injected together and allows new dependencies to be added in the future without changing the API
- Injecting the specific dependency is better where it is specifically needed by the code in question
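The generic extension methods mentioned above (to avoid casting at Get call sites) might look something like this; a sketch only, since the final shape was not decided in the meeting.

using System;

public static class DbDependencyResolverExtensions
{
    // Lets callers write resolver.Get<DbProviderServices>("System.Data.SqlClient")
    // instead of casting the result of the non-generic Get.
    public static T Get<T>(this IDbDependencyResolver resolver, string name)
    {
        return (T)resolver.Get(typeof(T), name);
    }

    public static T Get<T>(this IDbDependencyResolver resolver)
    {
        return (T)resolver.Get(typeof(T), null);
    }
}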
How this solves the specific problem:
- DbContextInfo adds a new dependency resolver to the default chain. This updates the app-domain wide configuration to use dependencies from the specified config:
DbDependencyResolver.Add(new DefaultDependencyResolver(_appConfig));
- We also looked at not changing the app-domain but instead having it configured onto the context and then flowed through to everywhere we use it:
var extendedResolver = new ResolverChain(DbDependencyResolver.Root);
extendedResolver.Add(new DefaultDependencyResolver(_appConfig));
context.Resolver = extendedResolver;
- The former has the advantages of simplicity for most of the stack and consistency for all the code no matter where it gets the root resolver from
- The latter has the advantage that DbContextInfo only sets the new resolver for the scope in which it is used, without changing the whole app-domain. However, uses of DbContextInfo are currently very limited and mostly based around app-domain modification anyway. We will provide a way for DbContextInfo to remove its modification to the app-domain so that the changes can be scoped if needed.
May 31, 2012
Making EF code more testable
Currently many EF classes are sealed and/or have non-virtual methods. This makes it hard to create mocks for these classes. Options:
- Use new mocking capabilities in .NET 4.5/Dev11
- Pros:
- No need to change existing classes just for testing
- Well-defined public inheritance is retained as recommended by Framework guidelines
- Cons:
- Will tie us to .NET 4.5 which will be problematic for testing against .NET 4.
- Doesn't act as a forcing function for generally improving the design (e.g. introducing seams)
- Create wrapper classes so we can mock internal types
- Pros:
- No need to change existing classes just for testing
- Well-defined public inheritance is retained as recommended by Framework guidelines
- Cons:
- Introduces an additional layer of indirection which is potentially not needed
- Doesn't allow the more mockable types to be used by customers
- Unseal classes and add virtual methods
- Pros:
- Allows customers to mock EF types more effectively as well as making it easier for us
- Doesn't change the amount of code/indirection we have or tie us to .NET 4
- Can still add internal classes for factoring where we want to change class responsibilities/design without breaking existing public surface
- Public inheritance can be used in places we didn't anticipate it
- Cons:
- Goes against Framework guidelines in that public inheritance can now be used in places where the code doesn't anticipate it, resulting in strange behavior in EF
Open questions:
- Should we remove wrapper classes that have already been added?
- Yes
- Should we go through and make everything virtual in one go?
- The problem with just making everything virtual and constructible is that it doesn't help introduce seams into the code that would allow appropriate dependency injection for people to use to substitute their own implementations or mocks
- In the future we need to address this and move to a more open architecture
- For now we will just use internal constructors and make methods virtual on a class-by-class basis as needed
- We will look at introducing abstract base classes or interfaces for publicly interesting classes
- If we need to jump through hoops to mock something then we will look at introducing seams and using dependency injection, and consider making it public
- What do we do for mocking of classes that we don't own?
- We should not use conditional compilation here
- We can wrap external classes in proxies and use refactoring to ensure that the proxies are used internally while the public surface doesn't change.
Code duplication in Async
It is not possible to use the same code for the async and sync versions through the normal mechanisms of re-use because of the way the compiler rewrites async code. This leads to some code duplication.
Possible solutions:
- We could use T4 templates or conditional compilation. We decided not to do this because it adds complexity/overhead to the build, which doesn't seem like a good tradeoff for the amount of duplicate code removed. Also, right now the async and sync methods are quite similar, but they will likely diverge as we implement more features, at which point the value of the T4/conditional approach decreases.
- We could call the async method from the sync version and block. We won't do this because it would very likely have a large perf impact.
- We will factor out the non-async parts of the methods where we can and as appropriate, and add documentation to the code to make sure people know that there are two versions of each method to be changed (see the sketch below).
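As an illustration (not actual EF code), the non-async preparation work can live in a single shared helper, leaving only the execution step duplicated:

using System.Data.Common;
using System.Threading;
using System.Threading.Tasks;

internal class CountQuery
{
    private readonly DbConnection _connection;

    public CountQuery(DbConnection connection)
    {
        _connection = connection;
    }

    public int Execute()
    {
        var command = CreateCommand(); // shared, non-async work
        return (int)command.ExecuteScalar();
    }

    public async Task<int> ExecuteAsync(CancellationToken cancellationToken)
    {
        var command = CreateCommand(); // same shared helper
        // NOTE: keep in sync with Execute() above; the compiler's async
        // rewrite prevents sharing the execution step itself.
        return (int)await command.ExecuteScalarAsync(cancellationToken);
    }

    private DbCommand CreateCommand()
    {
        // All of the command/parameter construction is shared between the
        // sync and async code paths.
        var command = _connection.CreateCommand();
        command.CommandText = "SELECT COUNT(*) FROM [Customers]";
        return command;
    }
}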
Supporting MVC scaffolding
Problem
The MVC scaffolding code makes use of types defined in the .NET Framework. This means that it will not work in EF6, when these types are pulled out of the .NET Framework into EF.dll.
Details
The MVC scaffolding code uses Dynamic Data. However, it doesn’t use the EF model provider built in to Dynamic Data but rather passes in a new EF model provider contained in the MVC assembly:
MetaModel metaModel = new MetaModel(false);
metaModel.RegisterContext(
(DataModelProvider) new EntityFrameworkDataModelProvider(this._contextType, modelAssemblies));
Given this information it seems to me we have a few options:
- We could add a method somewhere in EntityFramework.dll that would return a Dynamic Data model provider. We could essentially copy the code for this from MVC for EF5 to reduce the risk associated with adding this code. MVC could then call this method anytime it has a DbContext, and this would continue to work when we update to EF6.
- Pros: Makes MVC4 work with EF6; relatively low risk
- Cons: Dependency on dynamic data; adds public surface that we probably don’t want
- The Dynamic Data model provider in MVC could be updated to work entirely by Reflection against .NET or EF6 types
- Pros: Makes MVC4 work with EF6; doesn’t introduce new dependencies into EF5
- Cons: From looking at the code in MVC4 this seems like quite a lot of work and it would be easy to get it wrong, especially since we are still in the early days of EF6
- We could implement a full MVC scaffolder in EntityFramework.dll or a separate assembly in our NuGet package
- Pros: Makes MVC4 work with EF6; gives us control over the EF scaffolding so we can improve it when needed
- Cons: Very high risk for EF5; EF takes dependency on MVC which seems wrong
- We could do nothing now and then release an MVC scaffolder for EF6 with EF6 or as part of the MVC tooling update
- Pros: Lowest risk for Dev11; gives us time to get the scaffolder right for EF6
- Cons: MVC4 won’t work with EF6 without pulling in a new scaffolder; I’m not sure how easy it is to integrate a new scaffolder into MVC
Decision:
- We will update the EF scaffolder as part of the tooling update which will happen before EF6 goes RTM
- We can release an EF6 scaffolder for people using pre-release versions of EF6; this will no longer be needed once the tooling update ships