Our home directories use a slightly odd folder structure. They reside in "\\server\users\home\<department>\<accountname>", which means that whenever a user changes departments, his/her entire home directory must be moved to a new location.
A couple of weeks ago we started getting a growing number of support tickets from users complaining that their desktop icons had disappeared and that all their favourites were gone. So I started to investigate.
The event log gave me the first hint (translated from German):
Failed to apply policy and redirect folder "Favorites" to "%homeshare%\Favoriten".
Redirection options=0x1001.
The following error occurred: "".
Error details: "The specified path is invalid.".
The path is invalid? Why? It was working just fine for hundreds of other users. So I checked the HomeDirectory property in Active Directory for that user, verified that he/she was indeed in that department and that the folder actually existed on the server. Everything was there.
Then I checked the access rights on the user's home directory and bingo: the user did not have any rights on his/her own home directory. Not even read rights. But why? This was supposed to be handled by an automated task.
I decided to write a little PowerShell script to find out how big the problem currently was:
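# Query the domain once; build the wildcard pattern and the DOMAIN\ prefix from it.
$domain      = Get-ADDomain
$dnsroot     = "*" + $domain.DNSRoot + "*"
$netbiosname = $domain.NetBIOSName + "\"

# Enabled users whose home directory path contains the domain's DNS name.
# Only the HomeDirectory property is requested instead of all properties.
$list = Get-ADUser -Filter {enabled -eq $true -and homedirectory -like $dnsroot} -Properties HomeDirectory |
    Select-Object SamAccountName, HomeDirectory |
    Sort-Object HomeDirectory

foreach ($entry in $list) {
    # Skip home directories that do not exist or cannot be reached.
    if (Test-Path -LiteralPath $entry.HomeDirectory -ErrorAction SilentlyContinue) {
        # Keep only the ACL entries that name the user explicitly (DOMAIN\account).
        $acl = (Get-Acl -LiteralPath $entry.HomeDirectory).Access |
            Where-Object { $_.IdentityReference.Value -eq ($netbiosname + $entry.SamAccountName) }
        if (-not $acl) {
            Write-Host ($entry.HomeDirectory + "`tUser has no access")
        }
    }
}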
The script found over 30 home directories whose owners had no access. But why did we only get a handful of tickets about missing desktop files and favourites? Apparently the problem was more complex than I initially thought.
Windows uses the "HomeDirectory" attribute in AD to populate the environment variable %HomeShare% when you log on to a domain. But if that folder is not accessible, this fails silently and the environment variable is not created at all. That explains the weird error message in the event log: with the variable missing, "%homeshare%\Favoriten" is not a valid path.
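The effect of the missing variable can be reproduced with a small sketch. It only mimics the symptom with a plain environment-variable expansion; how folder redirection resolves the target internally is an assumption here, and the function name `Resolve-RedirectTarget` is made up for illustration.

```powershell
# Hypothetical helper (not part of any existing tooling): expands a redirection
# target the way a plain environment-variable expansion would.
function Resolve-RedirectTarget {
    param(
        [string]$Target = '%HomeShare%\Favoriten'
    )
    # Variables that are not defined are left untouched by ExpandEnvironmentVariables,
    # so an unset %HomeShare% survives as literal text in the result.
    $expanded = [Environment]::ExpandEnvironmentVariables($Target)
    [pscustomobject]@{
        Target   = $Target
        Expanded = $expanded
        # True only if no %name% placeholder is left over after expansion.
        Resolved = ($expanded -notmatch '%[^%]+%')
    }
}

Resolve-RedirectTarget
```

Run in the session of an affected user, `Resolved` should come back as `$false` and `Expanded` should still contain the literal `%HomeShare%`, which is not a usable path.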
This created some fun scenarios:
- When a new user joined the company and logged on to a machine without having permissions on his/her home directory, Windows could not apply the folder redirection, failed silently and let the user work with the local folders. And no one would notice.
- Sometimes departments get renamed but the users keep their old computers, while their home directories get moved to the new department's folder. When such a user logged on to his/her old computer, where folder redirection had already done its magic earlier, the paths in the registry still pointed to the old locations. Because the user had no permissions on the new home directory, Windows could neither populate the %HomeShare% variable nor update the registry, so the user got error messages that his/her desktop folder and so on did not exist.
- And whenever a user actually changed departments and got a new computer, Windows would again silently fail the folder redirection and the user would end up with an empty desktop because he/she was looking at the local folders.
But what about that script that was supposed to set the permissions in the first place?
To understand what is going on here, you need to know that users are not created directly in AD. They are maintained in another directory service, and a bunch of scripts then create the accounts in all the other systems based on a number of rules. Even the home directories are created that way.
The 1st-Level-Team is in charge of adding new users, and part of their workflow is to check the permissions of the home directory after its automated creation. At some point they felt it took too long for the permissions to be set, so they started setting them manually. That makes it impossible to figure out when the whole problem with the missing permissions started.
So what was happening with the script that was supposed to set the access rights? Well, turns out there were multiple things going wrong all over the place. The first and most facepalm-worthy was that the script tried to set the permissions on the folder while the user did not exist in AD yet. But it did not stop there.
The script was using a single text file as its todo list, let's call it "permissions.txt". It would find a newly created user, write his/her name into the file and then trigger a second script to actually set the permissions. That second script would open the file and keep it open. Meanwhile the first script was still running and trying to write more names to the file, which it could not do because the file was open and locked by the second script. The first script did not notice the failed writes, marked the subsequent accounts it found as "done" and never visited them again, which left those home directories without any permissions.
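I never saw the real scripts, so the following is only a sketch of how the writing side could avoid that trap: check whether the write actually succeeded and mark an account as "done" only if it did. The function name `Add-QueueEntry` and its retry parameters are hypothetical.

```powershell
# Hypothetical helper: appends an account name to the todo file and reports
# whether the write succeeded, so the caller can decide about marking it "done".
function Add-QueueEntry {
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string]$AccountName,
        [int]$Retries = 3,      # assumed value, tune as needed
        [int]$DelayMs = 200     # assumed value, tune as needed
    )
    for ($i = 0; $i -lt $Retries; $i++) {
        try {
            # -ErrorAction Stop turns a failed write (e.g. file locked by
            # another process) into an exception that can be caught here.
            Add-Content -LiteralPath $Path -Value $AccountName -ErrorAction Stop
            return $true
        } catch {
            # Any write error lands here; wait briefly and try again.
            Start-Sleep -Milliseconds $DelayMs
        }
    }
    # All attempts failed: the caller must NOT treat the account as handled.
    return $false
}
```

A caller would then only record an account as processed when `Add-QueueEntry` returns `$true` and leave it for the next run otherwise.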
To fix the issue, the guy in charge of these processes changed the script to use unique filenames. Problem solved. Or not. I still found newly created home directories without permissions every time the 1st-Level-Team added new users to the system. The first idea was to run a sort of "cleanup" script every day at some time after 18:00 to fix any missed permissions. But that meant newly created users would not be able to fully use their computers until the next day ... and for some reason the script did not do anything. The directories were still broken even two days later.
So the script was changed to run every hour instead. This worked. For the most part ... except when it did not. Any home directories created after around 16:00 still had no permissions set the next morning, even though the script was set to run every hour.
Turns out there was another rule in place that kept the scripts from running after business hours, in order to prevent any changes from happening while the backup processes were running. That presumably also explains why the daily cleanup run after 18:00 never did anything.
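What made this so hard to spot was that the skipped runs left no trace. A guard like that could at least say so when it declines to run. The sketch below is purely illustrative: the function name `Test-InRunWindow` and both hour values are assumptions, since I do not know the real cut-off times.

```powershell
# Hypothetical guard: returns $true only inside the allowed run window.
function Test-InRunWindow {
    param(
        [datetime]$Now = (Get-Date),
        [int]$StartHour = 7,    # assumed start of business hours
        [int]$EndHour   = 16    # assumed cut-off before the backups start
    )
    return ($Now.Hour -ge $StartHour -and $Now.Hour -lt $EndHour)
}

if (-not (Test-InRunWindow)) {
    # Make the skip visible instead of exiting silently.
    Write-Warning 'Outside the allowed run window; skipping permission fix-up.'
}
```

With values like these, a job scheduled for 18:00 would be skipped on every single run, which is easy to miss when nothing is logged.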
Figuring all this out and fixing the mess took a few weeks. All the while I ran my script daily to check the permissions and, when needed, repair them manually.
So yeah, a few tickets about people missing the files on their desktop turned into a full-blown mess.
Disclaimer: I do not have access to the scripts running in the background doing all the magic, so I always had to involve people from other departments and have them make the fixes, which obviously slowed down the process of getting this issue resolved quite a bit.
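The manual repair could also be scripted. The following is a minimal sketch, not what was actually used: the function name `Grant-HomeDirectoryAccess` is made up, and the default of "Modify" rights with inheritance to subfolders and files is an assumption that has to be adapted to your own permission concept.

```powershell
# Hypothetical helper: adds an explicit allow rule for the user to the
# ACL of his/her home directory. Supports -WhatIf for a dry run.
function Grant-HomeDirectoryAccess {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string]$Identity,   # e.g. 'DOMAIN\accountname'
        # Assumed default; choose the rights your environment requires.
        [System.Security.AccessControl.FileSystemRights]$Rights = 'Modify'
    )
    # Let the rule apply to the folder, its subfolders and its files.
    $inherit   = [System.Security.AccessControl.InheritanceFlags]'ContainerInherit, ObjectInherit'
    $propagate = [System.Security.AccessControl.PropagationFlags]::None
    $type      = [System.Security.AccessControl.AccessControlType]::Allow
    $rule = New-Object System.Security.AccessControl.FileSystemAccessRule(
        $Identity, $Rights, $inherit, $propagate, $type)

    $acl = Get-Acl -LiteralPath $Path
    $acl.AddAccessRule($rule)
    if ($PSCmdlet.ShouldProcess($Path, "Grant $Rights to $Identity")) {
        Set-Acl -LiteralPath $Path -AclObject $acl
    }
}
```

Calling it with `-WhatIf` first, for example `Grant-HomeDirectoryAccess -Path $entry.HomeDirectory -Identity ($netbiosname + $entry.SamAccountName) -WhatIf` inside the loop of the check script, shows what would be changed without touching any ACL.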