Powering Microsoft Dynamics NAV 2013 on AWS - 2

August 10, 2016

In the earlier article we explored Active Directory solutions and a two-tier architecture for Dynamics NAV 2013 on AWS. In this article we will explore a high-availability SQL Server setup on EC2 for a production Dynamics NAV implementation.

SQL Server Cluster for Dynamics NAV    

This was the trickiest part of the entire architecture.  Here are the choices we went through:

RDS-based SQL Server

We couldn't have gotten any simpler than RDS as the default choice for MS SQL Server, given that our MySQL payloads are also performing very satisfactorily on RDS. However, there is a key stumbling block that turned us away from RDS for Dynamics NAV. Dynamics NAV, for better or worse, depends on db_owner permission on the master database of the SQL Server instance. RDS, however, doesn't let anyone tamper with permissions on the master database, super user included. This was a big bummer and the end of the road for our RDS considerations for Dynamics NAV.

SQL Server Enterprise AMI on EC2

Here we ran into a peculiar behaviour of Dynamics NAV. The EC2 SQL Server AMI ships with an already installed SQL Server instance which, as usually goes with SQL Server installations, is named MSSQLSERVER. It is said that in computer science, naming is one of the most difficult things to accomplish. Sadly, Dynamics NAV takes this quote to the next level by making sure your setup doesn't go through if you point it at that default MSSQLSERVER instance. This kinda makes sense, because Dynamics NAV doesn't even want to touch the default SQL Server installation with a ten-foot pole.

Sadly, this is also where the EC2 AMI falls short. We could have promptly added one more SQL Server instance on the same node and been done with our business. However, the AMI has no installation media bundled for adding a SQL Server instance, and AWS somehow thinks this is a brilliant idea. If you are very patient and ready to go back and forth with AWS support, you can request Amazon to provide the SQL Server installation media. However, we are not convinced about the licensing and audit implications of this route.

Using a vanilla Windows 2012 R2 AMI for SQL Server 2012  

The only pain point of using a vanilla Windows 2012 R2 instance is taking care of the initial installation, which can be automated easily through Chef. The advantages of going this route, however, are manifold: you are not billed for a SQL Server license on top of the instance, and instead pay AWS only for what you use, the EC2 node. You also get complete say over which SQL Server features you toggle and which additional database instances you install. And not being locked in to a special EC2 node, thanks to the portability of your SQL Server license, is always a good-to-have.

MS SQL Cluster High Availability Setup on EC2

Eventually, we ended up with a two-node SQL Server High Availability cluster with manual failover. Setting up a master-replica configuration for SQL Server by hand has a lot of caveats and requires you to jump through hoops; in SQL Server terminology, this is a High Availability setup. Here is a short gist. While we haven't figured out a way to automate the entire HA cluster setup, we did enable the Failover Clustering feature through a simple windows_feature block in the Chef recipe.
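For reference, here is a minimal sketch of that step. Our actual automation is the Chef windows_feature resource; what follows is a rough Python equivalent driven from a Linux node via pywinrm (the library mentioned later in this article), with a hypothetical hostname and credentials:

    # Rough equivalent of the windows_feature Chef resource, driven from a
    # Linux node over WinRM. Host and credentials are placeholders.
    import winrm

    session = winrm.Session(
        "sql-node-1.example.internal",      # hypothetical SQL node
        auth=("EXAMPLE\\Administrator", "secret"),
        transport="ntlm",
    )

    # Failover-Clustering is the Windows feature the SQL HA setup needs.
    result = session.run_ps(
        "Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools"
    )
    print(result.status_code, result.std_out.decode())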

A few things to consider while going for a SQL HA cluster on AWS:

  • Go for the R instance family, as you are going to need ample memory here along with decent CPU; this is what AWS suggests by default for its database instance families too.
  • Make sure to set up SQL Agent jobs that take full DB backups and transaction log backups and ship them to S3 (a minimal upload sketch follows this list).
  • Be aware of CAL licensing terminology, where a customer is billed per core unless you have a lot of spare money to burn. You will end up buying a CAL license plus core licenses for half the number of cores on your main server, and per Microsoft's terms this can be replicated on at most one more machine. It is absolutely OK to spend some extra time getting the licensing verified before going in for the big purchase, considering potential audits in the future.
  • Think carefully before setting up automatic failover with an Availability Group Listener. If you configure SQL Server for automatic failover, chances are your DB will fail over at an inopportune time, and this can lead to needless consequences if your two nodes are not of the same capacity.
  • Be aware of the ports exposed on your SQL machines. We can't stress this point enough: don't expose any port to the public. There are enough brute-force attacks from all over the internet on port 1433, or whatever SQL port you are using, to make you sleep-deprived just thinking about the exposure. Go with the rule of thumb of opening only the bare minimum until something breaks.
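To illustrate the backup point above: once SQL Agent jobs have written .bak/.trn files to a local folder with BACKUP DATABASE and BACKUP LOG, a small script can ship them to S3. The sketch below is ours, with a hypothetical bucket and folder, using boto3:

    # Ship local SQL Server backup files to S3 with boto3.
    # Backup folder, bucket, and prefix are hypothetical placeholders.
    from pathlib import Path

    import boto3

    BACKUP_DIR = Path(r"D:\SQLBackups")   # where BACKUP DATABASE/LOG writes
    BUCKET = "example-nav-sql-backups"    # hypothetical bucket
    s3 = boto3.client("s3")

    for backup in BACKUP_DIR.glob("*"):
        if backup.suffix.lower() not in (".bak", ".trn"):
            continue
        s3.upload_file(str(backup), BUCKET, f"sql-backups/{backup.name}")
        backup.unlink()                   # keep the local disk from filling up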

Common Issues of SQL Server with Dynamics NAV:

  • The NAV desktop/web client refuses to start due to incorrect privileges on the database. The Active Directory user powering the SQL Server service needs db_owner permission on the master database along with the application database.
  • Make sure the SQL Server Browser service is running for NAV to be able to talk to SQL Server.
  • Make sure the firewall is configured correctly on the Windows machine. Since we can restrict ports directly in the AWS security group, you may want to keep the Windows firewall rules to a bare minimum to avoid double-configuration headaches.

Setting up a read replica: once the HA cluster is set up for SQL Server, read-only access to the secondary DB can be configured easily, so that your analytics/reporting teams can work with live data without slowing down your primary database.
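As a sketch of what that configuration looks like (the availability group name, replica name, and server below are hypothetical, and driving the T-SQL through pyodbc is purely our choice for illustration):

    # Allow read-only connections on the secondary replica of an
    # availability group. AG name, replica name, and server are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=sql-node-1.example.internal;"   # current primary
        "Trusted_Connection=yes;",
        autocommit=True,
    )

    conn.execute("""
        ALTER AVAILABILITY GROUP [NAV_AG]
        MODIFY REPLICA ON N'SQL-NODE-2'
        WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY))
    """)

Reporting tools can then connect to the secondary directly, or to the listener with ApplicationIntent=ReadOnly in the connection string if read-only routing is configured.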

Monitoring   

Because the bulk of our systems are Linux-based, and even our monitoring hosts are Linux-based, we wanted to go for a monitoring system that reports to a Linux node.

Sensu + Uchiwa for Monitoring 

The first order of business was to monitor the NAV and SQL services with Sensu. We went a step further and integrated the alerts with PagerDuty, so that the DevOps team gets alerted via call/SMS whenever either of these services goes down, as this can mean delays in the supply chain if not mitigated quickly.
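A Sensu service check boils down to a script that exits 0 when everything is fine and 2 for critical. The sketch below is illustrative (our actual checks aren't reproduced here): it queries a Windows service with sc.exe and uses an example NAV 2013 service name, which typically follows the MicrosoftDynamicsNavServer$<instance> pattern:

    #!/usr/bin/env python
    # Illustrative Sensu-style check: exit 0 if a Windows service is
    # RUNNING, exit 2 (critical) otherwise. Runs on the Windows host.
    import subprocess
    import sys

    SERVICE = "MicrosoftDynamicsNavServer$DynamicsNAV70"  # example name

    result = subprocess.run(
        ["sc", "query", SERVICE], capture_output=True, text=True
    )

    if "RUNNING" in result.stdout:
        print(f"OK: {SERVICE} is running")
        sys.exit(0)

    print(f"CRITICAL: {SERVICE} is not running")
    sys.exit(2)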

Secondly, we implemented a plugin on top of Uchiwa for restarting services on the fly, without having to do the whole RDP dance routine. While implementing this, we went for a simple client/server mechanism to remotely trigger a service restart. This could be better handled by WinRM, as sketched in the WinRM Migration section below.

Deployments   

Deployments that don't involve schema-level changes are automated. Developers upload their changes in the form of a .fob binary to an S3 bucket. An SNS notification configured for upload events pings a Flask-based server running on Docker, which talks to a secondary NAV server via the winexe utility and downloads and deploys the .fob file on that server.
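The shape of that pipeline, as a minimal sketch; the endpoint path, host, credentials, and the remote deploy script are our illustrative placeholders, not the production code:

    # SNS-to-winexe deploy hook (sketch). All names are placeholders.
    import json
    import subprocess

    from flask import Flask, request

    app = Flask(__name__)

    NAV_HOST = "//nav-secondary.example.internal"  # hypothetical NAV server
    WINEXE_AUTH = "EXAMPLE/deployuser%secret"

    @app.route("/sns", methods=["POST"])
    def sns():
        body = json.loads(request.data)

        # SNS asks us to confirm the subscription on first contact.
        if body.get("Type") == "SubscriptionConfirmation":
            print("Visit SubscribeURL to confirm:", body["SubscribeURL"])
            return "", 200

        # The S3 upload event is JSON nested inside the Message field.
        event = json.loads(body["Message"])
        record = event["Records"][0]
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Kick off the download-and-deploy script on the NAV box.
        subprocess.run([
            "winexe", "-U", WINEXE_AUTH, NAV_HOST,
            f"powershell -File C:\\deploy\\import-fob.ps1 {bucket} {key}",
        ], check=True)
        return "", 200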

What’s Next?   

Offloading NAV job queues 

For better performance, we use an additional NAV node whose sole role is running all the NAV job queues. This is done to minimise load on the main application server. Make sure you don't install the Dynamics NAV web client on this server, as you wouldn't want the same data to be overwritten from multiple places.

WinRM Migration 

Instead of running hooks on top of Ruby/Python that run natively on Windows, we are exploring the native SOAP-based WinRM API, which is bundled with Windows from 2012 onwards and enabled on EC2 nodes by default. This will open a lot of new avenues for further automation in monitoring, deployment, and setup. Currently, the Python library pywinrm makes it a lot easier to interact with WinRM.
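For example, the service-restart case from the Monitoring section shrinks to a few lines with pywinrm; the host, credentials, and service name below are placeholders:

    # Restart a NAV service over WinRM with pywinrm (sketch).
    import winrm

    session = winrm.Session(
        "nav-app-1.example.internal",
        auth=("EXAMPLE\\svc-deploy", "secret"),
        transport="ntlm",
    )

    result = session.run_ps(
        "Restart-Service 'MicrosoftDynamicsNavServer$DynamicsNAV70' -Force"
    )
    print(result.status_code, result.std_err.decode())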

Version Controlled Deployments 

As of now, we upload a binary to S3, which gets automatically deployed on NAV. A big caveat of this approach is that features and bug fixes are at the whims of the developers, who are often from third-party vendors, so the business is often in the dark about the changes going live. NAV also supports .txt-based deploys, where the .txt file needs to be built and compiled. Once version-controlled deploys go live, we can trigger a deploy directly from a Bitbucket commit through repository polling or pipelines.

 

This article has been authored by Ganesh Hegde, a DevOps engineer at Urban Ladder. When not fiddling with code and systems, Ganesh loves to pretend he is on a diet. He has a penchant for dark humour.
