
Analysis of Administrator Propaganda: The story of troubleshooting a web service without touching the code

Tayyebi ・3 min read

Hooray! It's Friday (the weekend here in the bloody Middle East) and time to write a blog post. I hope you are doing well.

Problem

We had a cute troubleshooting experience yesterday. It was about a '403 Forbidden' error that our company's server was returning to a third-party business's API.


Everything was fine until this week. No code had changed, nothing in the network had moved, no new firewall policy had been applied, and everything was exactly the same as in previous weeks.

But Server B (which requests the last image from Server A) was receiving 'HTTP Error 403', and everyone was shocked. Our IIS logs also confirmed that the problem was on our side.

We reviewed the code of App 1 and App 2. App 1 downloads images from the cameras using ffmpeg, converts them, keeps a copy in an archive, and stores the last image in a directory under a specific name. App 2 (an asmx web service) waits for a request from Server B and sends the image back as a byte array in the response.
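
To make the handoff between the two apps concrete, here is a minimal sketch in Python. It is not our real code (App 2 is an asmx service, and App 1 calls ffmpeg, which is left out here). The file name `last.jpg` and both function names are made up for the example.

```python
import os

LAST_IMAGE_NAME = "last.jpg"  # hypothetical fixed name; the real one is not given in this post


def store_snapshot(image_bytes, archive_dir, live_dir, stamp):
    """App 1 side: keep an archive copy and refresh the fixed-name 'last image'."""
    os.makedirs(archive_dir, exist_ok=True)
    os.makedirs(live_dir, exist_ok=True)

    # Archive copy, one file per snapshot.
    archive_path = os.path.join(archive_dir, f"{stamp}.jpg")
    with open(archive_path, "wb") as f:
        f.write(image_bytes)

    # Write next to the target first, then swap it in, so a reader
    # never sees a half-written image.
    live_path = os.path.join(live_dir, LAST_IMAGE_NAME)
    tmp_path = live_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(image_bytes)
    os.replace(tmp_path, live_path)
    return archive_path


def read_last_image(live_dir):
    """App 2 side: return the latest image as bytes for the response."""
    with open(os.path.join(live_dir, LAST_IMAGE_NAME), "rb") as f:
        return f.read()
```

Note that every file written by `store_snapshot` belongs to whichever account runs it, while `read_last_image` runs under a different account in our setup.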

Troubleshooting steps:

1- Double-checked the network architecture and got permission to review the network engineers' reports: nothing.

2- Double-checked the git changes: nothing.

3- Double-checked the SOAP API changes with the third-party business: no changes had been applied, and we were the only contractor who had this error.

4- Reviewed the code. The database was OK, and the only thing not under our control was the file system, which became the accused 🔍🎩!

5- Reviewed the GetFiles() documentation on the Microsoft website. Took a look at the Exceptions section and voilà!

UnauthorizedAccessException
The caller does not have the required permission.

6- It was definitely GetFiles(), but why? Maybe App 2 was requesting the file while ffmpeg (App 1) was still downloading and converting the image, and the web service broke on that critical section. That was also wrong, because ffmpeg creates the file only at the end and releases it instantly.

7- Key question: which user is running App 1? Administrator. And which user is running App 2? IIS_IUSRS! That's it!

8- App 1 is triggered by Windows Task Scheduler every minute to take a snapshot, so we simply set the user to IIS_IUSRS in the scheduled task, under Security options > Change User or Group.
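
A check like the one in step 7 can also be automated. The sketch below is a Python analog, not what we actually ran: it lists a folder and tries to read one byte from every file, so running it under the same account as the web service would show which files that account cannot touch. The function name `find_unreadable` is made up for the example.

```python
import os


def find_unreadable(directory):
    """Return (path, error) pairs for everything the current account cannot read."""
    try:
        names = sorted(os.listdir(directory))
    except OSError as exc:  # covers PermissionError, FileNotFoundError, ...
        return [(directory, f"{type(exc).__name__}: {exc}")]

    problems = []
    for name in names:
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # only regular files matter here
        try:
            with open(path, "rb") as f:
                f.read(1)  # one byte is enough to prove read access
        except OSError as exc:
            problems.append((path, f"{type(exc).__name__}: {exc}"))
    return problems
```

An empty list means the account can list the folder and read every file in it. Anything else points at the file (or the folder itself) and the error the account gets.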

Conclusion

  • It's not always because of the code. Don't blame the developers!

  • Maybe IIS keeps a cached list of the files in its website directory, including permission details for each file. If so, when ffmpeg creates the file and its permissions are updated only after the logic ends, IIS would not refresh the file details immediately after the modification. That would cause exceptions in higher layers, especially as the folder gets heavier over time.

  • Use PowerShell more. All of this could simply be handled with .bat files and basic Windows commands. Complexity makes troubleshooting difficult.

  • Comment all code, everywhere, even if it is as simple as a foreach over GetFiles().

  • When code is going to run a million times a day, check each obvious line again, and copy-paste the potential errors from the documentation into code comments, with a link and details.
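
As an illustration of the last two points, here is how such a commented listing call could look. This is a Python sketch rather than our C# code: `os.listdir` stands in for GetFiles(), and `ImageFolderError` and `list_images` are names invented for the example.

```python
import os


class ImageFolderError(Exception):
    """Hypothetical error type that names the folder and the likely cause."""


def list_images(directory, extension=".jpg"):
    # os.listdir can fail with these OSError subclasses:
    #   FileNotFoundError  - the directory does not exist
    #   NotADirectoryError - the path points at a file
    #   PermissionError    - the calling account lacks permission; roughly
    #                        the role UnauthorizedAccessException plays in .NET
    try:
        names = os.listdir(directory)
    except PermissionError as exc:
        raise ImageFolderError(
            f"no permission to list {directory!r}; check which account runs this process"
        ) from exc
    except (FileNotFoundError, NotADirectoryError) as exc:
        raise ImageFolderError(f"{directory!r} is not a usable folder") from exc

    # Case-insensitive match on the extension, sorted for stable output.
    return sorted(n for n in names if n.lower().endswith(extension))
```

With the possible errors written next to the call, the next person who sees a failure in this function has the list of suspects right there.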

I hope you will let me know if you have any comments. Thank you!
