Fix non-standard permissions on public role

I recently ran into a situation where a piece of software we were implementing had poorly implemented security. That is to say – no security. All permissions on all objects were granted to the public role in the database. This may be acceptable for a small company without a DBA (not really, but ostrich security never hurt anyone, right? /s); my employer cannot accept that level of risk, so I was tasked with fixing the problem.

At its core, it's a simple fix, right? Create a new role, push the permissions to that role, then revoke those permissions from public. Unfortunately, the database in question has enough objects that doing a manual comparison and migration is impractical. Time to dig in and write a script to fix it. Here is what I came up with:

declare
@newrole sysname,
@sql nvarchar(255);

set @newrole = 'newrole'; -- give the new role a name

set @sql = 'create role ' + @newrole + ';';
--exec (@sql); -- uncomment to actually create the role

--USE THIS SCRIPT TO GENERATE THE GRANT/DENY/REVOKE STATEMENTS TO FIX THE PERMISSIONS ON THE PUBLIC ROLE
with
pub as -- all permissions currently granted to the public role (grantee_principal_id 0) on objects in the dbo schema
(
select object_name(major_id) as majName, p.type, permission_name, state, state_desc, o.type_desc as otypedesc
from sys.database_permissions as p
inner join sys.objects as o on p.major_id = o.object_id
where grantee_principal_id = 0 and class = 1 and o.schema_id = 1
),
def (majName, type, permission_name, state, state_desc, otypedesc) as -- permissions granted to public by default (the database diagram objects)
(
select 'fn_diagramobjects', 'EX', 'EXECUTE', 'G', 'GRANT', 'SQL_SCALAR_FUNCTION'
union select 'sp_alterdiagram', 'EX', 'EXECUTE', 'G', 'GRANT', 'SQL_STORED_PROCEDURE'
union select 'sp_creatediagram', 'EX', 'EXECUTE', 'G', 'GRANT', 'SQL_STORED_PROCEDURE'
union select 'sp_dropdiagram', 'EX', 'EXECUTE', 'G', 'GRANT', 'SQL_STORED_PROCEDURE'
union select 'sp_helpdiagramdefinition', 'EX', 'EXECUTE', 'G', 'GRANT', 'SQL_STORED_PROCEDURE'
union select 'sp_helpdiagrams', 'EX', 'EXECUTE', 'G', 'GRANT', 'SQL_STORED_PROCEDURE'
union select 'sp_renamediagram', 'EX', 'EXECUTE', 'G', 'GRANT', 'SQL_STORED_PROCEDURE'
),
fin as -- everything granted to public that is not there by default
(
select * from pub
except
select * from def
)
select
*, -- show the object data for review
'revoke ' + permission_name + ' on ' + majName + ' from public', -- revoke statements
state_desc + ' ' + permission_name + ' on ' + majName + ' to ' + @newrole -- grant/deny statements
from fin
order by majName;

Fortunately, the default permissions granted to the public role are few and easy to enumerate. By selecting all permissions that exist for public and removing the default permissions via the EXCEPT operation, we are left with all non-standard permissions for that role. From there, the final query returns dynamic SQL for the necessary revoke and grant/deny statements.

There are a couple of holes in this script that I purposely left in. First, the script stops at generating the scripts rather than automatically processing them. I may at some point go back and make this a fully automated solution, but I felt it was faster to copy/paste, and safer to force at least a cursory review of the changes before running them.

The second hole is that the script returns the GDR (grant/deny/revoke) scripts based solely upon what is in the security tables. I didn't have the time or inclination to check for allowed permissions during the migration. This hole isn't immediately obvious, but it is important. At the time my database was created, the developer granted permissions using the ALL keyword (since deprecated, but maintained for compatibility). This meant that all of the functions they had developed had insert/update/delete permissions granted; you cannot perform insert/update/delete operations on scalar functions, and SQL Server kindly informs you of this when you try to revoke, grant, or deny those permissions directly on a function. Running the scripts exactly as returned by the code above therefore meant cleaning up unnecessary/invalid permissions in the middle of the migration. Those invalid permissions could be removed by replacing the permission name (select, insert, etc.) with the ALL keyword after the valid grant/deny statements were run.
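For illustration, the cleanup statement for one of those functions looked like this (fn_example is a hypothetical name; the deprecated ALL keyword conveniently covers whichever permissions are actually recorded against the object):

revoke all on dbo.fn_example from public;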

The final step in the process is to move the database users into the new role. This can be accomplished with the following snippet:

select 'alter role [newrole] add member [' + [name] + ']'
from sys.database_principals as d
inner join <dbuser table> as t on d.[name] = t.<user id> -- remove or comment this line if no users table exists in the database
where type = 'S' -- SQL users only
order by [name]
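One caveat: ALTER ROLE … ADD MEMBER requires SQL Server 2012 or later. On older instances, a variant generating sp_addrolemember calls (deprecated, but still functional) does the same job – a sketch with the optional users-table join omitted:

select 'exec sp_addrolemember ''newrole'', ''' + [name] + ''''
from sys.database_principals
where type = 'S'
order by [name]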

Once again, I stopped at returning dynamic SQL so I would be forced to review the results before running them.

The combination of these two scripts resulted in a successful migration of permissions to a new role, which allowed us to preserve proper security levels for DB users that are not related to the application, such as service accounts, non-privileged DBA accounts, etc. The transition was transparent to the end users.

These scripts could also serve as a framework – a few simple modifications would allow you to transfer permissions (all or a subset) between custom roles.
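For example, pointing the script at an existing custom role instead of public only requires swapping the hard-coded grantee filter in the pub CTE for a principal lookup (oldrole is a placeholder name), plus the matching change to the target of the generated revoke statements:

where grantee_principal_id = database_principal_id('oldrole') and class = 1 and o.schema_id = 1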


Disallow results from triggers – this isn’t the solution I was looking for

TL;DR Version:

You can set the advanced option “Disallow results from triggers” to 1 to get past the error “Message: A trigger returned a resultset and/or was running with SET NOCOUNT OFF while another outstanding resultset was active.”
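For reference, it is an advanced option, so it has to be surfaced before it can be changed:

exec sp_configure 'show advanced options', 1;
reconfigure;
exec sp_configure 'disallow results from triggers', 1;
reconfigure;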

Before you do this, be sure that you know the root cause of the error, and the impact of that setting – the problem may not actually be with the trigger.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

My employer's ERP encrypts all of its stored procedures. While this is good for them, it means that whenever something must be added to the workflow for, say… saving customer records… we must implement triggers to get the job done. The ERP also has many popular 3rd party add-ins or modules which are forced into the same methodology. The end result is that the customer table, for instance, has 11 (eleven!!!) triggers on it. Only one of those is in-house.

Let me just say for the record that, as a general rule, I really do not like triggers. Yes, I know they have their place and are often necessary. That said, between the fragmented code, the increased potential for blocking/deadlocks, etc., I've found they're generally more trouble than they're worth.

So – on to the issue at hand.  We are currently changing some applications in our stack, and moving to a hosted CRM solution.  As part of that move, we must build integrations for master data between the ERP and CRM systems.  One of the business requirements is that the master data key from CRM must exist in the ERP so that integrations to other systems can be created.

The CRM integration utilizes two triggers on the master data tables (one each for inserts and updates) which write to a queue table; the integration then reads from that table to pull master data into CRM and do its thing.  From there, the integration then pulls the ID from CRM and pushes it back to the ERP.

I'm sure all you DBAs out there have spotted the first problem with this sequence – updates coming from CRM will fire the triggers, placing the record back in the queue; this obviously defeats the purpose of the queue, as nothing would ever fall out of it. To get around this, we added code to the triggers to check the login making the update – if the update comes from the CRM service account, the trigger returns without executing any additional code. Problem 1 solved.
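For illustration, the guard looked roughly like this (a minimal sketch – the table, queue, and service account names are placeholders, not our actual objects):

create trigger dbo.trg_customer_crm_queue on dbo.Customer
after update
as
begin
set nocount on;

-- bail out when the update comes from the CRM integration's own service account
if original_login() = 'crm_svc'
return;

-- otherwise queue the changed rows for the integration to pick up
insert into dbo.CrmQueue (CustomerId)
select CustomerId from inserted;
end;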

With the aforementioned loop handled, we proceeded to test the integration.  First with one record – success; then with multiple records – FAILURE!  I’m guessing this is why you’ve hit this entry – the error was as follows:
"Message: A trigger returned a resultset and/or was running with SET NOCOUNT OFF while another outstanding resultset was active."
Well crap.

I'll freely admit that this was the first time I had encountered this error. After a few minutes of googling, I had confirmed something that made intuitive sense to me but that I had never really needed to think about: triggers that return results are bad (I'm sure there are some who disagree with that blanket statement, but I'll stick by it). I also learned about the advanced setting "Disallow Results from Triggers", and how it's set to 0 by default (allowing results) despite the best practice being not to allow results from triggers… makes perfect sense.

In all of my searching, the usual scenario seemed to be a new trigger on a table suddenly causing the error, and only two solutions were suggested on any of the sites I looked at: either fix the trigger (set nocount on, remove any select statements returning results), or set that advanced option to 1, which affects all triggers on the server.

Armed with those bits of knowledge, I began looking at the code on my server…  and found that all of the involved triggers followed best practice.  All had nocount set to on, and none returned results via a select.

Well that was unexpected.  I had expected that at least one of the triggers would have the decency to have bad code and give me an easy fix.  Not so.  Time to dig deeper…

The next step in troubleshooting was to disable all but one of the triggers.  Because this issue cropped up during the CRM integration, we disabled the CRM triggers to prove that they weren’t the issue.  Sure enough, success with one row, failure with multiple – even with only one trigger.  Things were getting weird.
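For reference, individual triggers can be switched off and back on without dropping them (names here are placeholders):

disable trigger dbo.trg_crm_customer_update on dbo.Customer;
-- run the test, then re-enable
enable trigger dbo.trg_crm_customer_update on dbo.Customer;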

The next step was to do the obvious – set the disallow results from trigger option to 1.  As expected, everything worked, even with all triggers enabled and multiple rows pushed through the integration.  Success, right?  Not quite.

You see, we have a few other integrations that have been in place for a long time which do mass updates on records with those same triggers in place, and those NEVER FAILED. If that setting were the solution, we should have seen that error long before now. Further, our ERP has… lots… of tables, and upwards of 10 triggers on a given table isn't uncommon (did I mention how much I hate this model?). So not only do we have LOTS of triggers which have never failed due to multiple resultsets, but they are spread over close to 25 databases, because each legal entity is contained in its own database for the ERP. Since the server setting really is a server-wide setting, I would have had to go through all of the code for all of the triggers in all of the databases to ensure the change would not break something. NOT GOING TO HAPPEN.

Where to look next?  Well, considering that everything was functional before the CRM integration, and that the CRM triggers follow best practice and had been proven not to be the issue, the only logical place left to look was the code for the integration itself.

The integration is built using a hosted 3rd party tool.  I set up another meeting with the vendor, and we dove in.  On my side, I set up a trace to watch activity coming from the integration.  On their side, they first ran the integration for one record.  Once again success.  In the trace, I noted that the integration ran a query to pull the record to be integrated, then ran another to push the update coming back from CRM.  I was using the default trace, so these showed up as:
Batch Starting: select top 1 * from <table>
Batch Completed: select top 1 * from <table>
Batch Starting: update <table> set...
Batch Completed: update <table> set...

Next was the run for multiple records.  This time the trace looked like this:

Batch Starting: select top 1 * from <table>
Batch Starting: update <table> set...
Batch Completed: update <table> set...
Batch Starting: update <table> set...
Batch Completed: update <table> set...

See the difference? When running for one record, the integration closes its batch before proceeding to update the record. When running for multiple records, that initial resultset is held open while the rows are processed RBAR (row by agonizing row). Remember our error?
"Message: A trigger returned a resultset and/or was running with SET NOCOUNT OFF while another outstanding resultset was active."
This error is raised because there is an open resultset that prevents the triggers from executing. Why? We could get technical and talk about locking, transaction isolation, etc., but that shouldn't be necessary. Suffice it to say that the triggers cannot execute while another resultset on the same connection is still open.

The end result in this situation was that the integration was re-written.  We tried a few different ways to force that initial query batch to terminate before processing the rows, but limitations in the development tool prevented this from working.

I'd wager that the majority of the time you see this error, it is due to a problem with the triggers. When it's not, you always have that big hammer of changing server settings. However, as my experience shows, it is a good idea to dig a little to determine whether the error actually lies in SQL Server before you do anything too drastic.