In today's data-driven era, a huge number of reports are generated every minute. Uploading and distributing these reports is an inconvenience, to say the least, and a security threat if they include sensitive or confidential data.
I am currently in charge of approximately 200 analytics reports that are generated daily, attached to emails, and sent to the appropriate audience in different departments. While this is a time-consuming task, it was not a security concern, because these emails and reports were accessed on company-approved devices in a closed and highly secured network.
Now, with the pandemic and the stay-at-home orders, the chance of these reports being downloaded on unsecured machines became an urgent issue.
So I decided to take advantage of Microsoft OneDrive as a central storage solution: automate uploading these reports to departmental folders and email my audience just a link to the file location. This reduces the chances of someone downloading the files to their machine, since Microsoft OneDrive encourages web previews.
This sounds like an easy task, right? Especially if you have worked with Python and APIs before. Include the requests library, configure your client ID and client secret, request an AccessToken/RefreshToken, and start rolling.
Well, it turned out that working with Microsoft Graph is not that easy, especially if you want to schedule your application to run in the background with no human intervention.
While Microsoft has good documentation, it was still vague on many critical subjects:
- Authorization and what endpoint to use
- Authenticating using your Microsoft account
- An app running in the background (a daemon app) requires highly privileged admin access that I don't have
- How to use delegated permissions instead and still run your app in the background
- How to set up the headers for resumable large-file uploads
After doing what every developer does, from reading documentation to looking up solutions and ideas on Stack Overflow, I wasn't able to find a solution in Python, especially for resumable large-file uploads.
So I decided to publish my solution to help any fellow developer who is in the same shoes I was in a couple of weeks ago!
For the complete code please visit my GitHub
There are two major parts to this tutorial:
- Create, set up and configure the API on Azure Portal
- Write the Python script
Part 1 - Create, set up and configure the API on Azure Portal
Step 1: Register your application
Go to https://portal.azure.com/#home
Azure Active Directory -> App Registration -> New Registration
- Name your API
- Accounts in this organizational directory only (Single Tenant)
- A redirect URI is not needed, as our app runs and authenticates in the background
- Click Register
Once your new API is created, click on it and save the following two values for the code later:
- Application (client) ID
- Directory (tenant) ID
Then grab the OAuth 2.0 authorization endpoint (v2), which is listed under Endpoints on the app's Overview page.
We will use it in the authorization script to generate the URL of the permissions consent page.
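If the endpoint is hard to locate in the portal, it can also be assembled from the Directory (tenant) ID, since it follows the same pattern as the `auth_url` used in the consent script later in this article. A minimal sketch; the helper name `authorize_endpoint` is only illustrative, and the tenant ID is a placeholder:

```python
def authorize_endpoint(tenant_id: str) -> str:
    """Return the OAuth 2.0 authorization endpoint (v2) for a single tenant."""
    return "https://login.microsoftonline.com/{}/oauth2/v2.0/authorize".format(tenant_id)


# Placeholder tenant ID: substitute the Directory (tenant) ID saved above.
auth_url = authorize_endpoint("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
print(auth_url)
```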
Step 2: Configure the API permissions
API permissions → Add a permission → Microsoft APIs → Microsoft Graph → Delegated permissions → Select permissions
Permissions needed: “Sites.ReadWrite.All” and “Files.ReadWrite.All”
Step 3: Expose the API
After adding the permissions in Step 2, we have to expose the API and add those permissions to its scopes.
Then we need to add the client ID and select the authorized scopes we just added.
Expose an API → Add a client application → Enter Client ID → select the Authorized scopes → click add application
Step 4: Edit the manifest (very important to allow implicit grant)
This is a very important step in the API setup. Go to Manifest and set oauth2AllowIdTokenImplicitFlow and oauth2AllowImplicitFlow to true.
Now that our API is all set and configured we can start writing some Python code!!!
Part 2 - Write the Python script
Our code consists of two Python scripts:
1- generateOneDriveAPIConsentURL-public.py
Script to generate the consent URL. This script is run once, after setting the permissions in the API setup, to give the user's consent to these permissions.
This is the best solution I found for running the main app in the background without using application permissions (which pose a high security risk, as they are global and highly privileged, and require admin approval).
Sticking to delegated permissions limits the app to the current user's privileges, and in virtually all cases it doesn't require admin approval.
import requests
import json
from requests_oauthlib import OAuth2Session
from oauthlib.oauth2 import MobileApplicationClient

client_id = "xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx"
scopes = ['Sites.ReadWrite.All', 'Files.ReadWrite.All']
auth_url = 'https://login.microsoftonline.com/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/oauth2/v2.0/authorize'

# OAuth2Session is an extension to requests.Session,
# used to create an authorization URL using the requests.Session interface.
# MobileApplicationClient is used to get the implicit grant.
oauth = OAuth2Session(client=MobileApplicationClient(client_id=client_id), scope=scopes)
authorization_url, state = oauth.authorization_url(auth_url)
consent_link = oauth.get(authorization_url)
print(consent_link.url)

This script connects to the Microsoft authorization v2 endpoint and sends the client ID and the scopes we are requesting. A URL is then generated and printed in the terminal:
c:/Users/jsnmtr/Code/onedrive/generateOneDriveAPIConsentURL-public.py
https://login.microsoftonline.com/xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx/oauth2/v2.0/authorize?response_type=token&client_id=xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx&scope=Sites.ReadWrite.All+Files.ReadWrite.All&state=xxxxxxxxxxxxxxxxxxxxxxxx
Open the link in your web browser and click Accept to grant the requested permissions.
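To make the structure of that consent URL easier to see, here is a sketch that assembles the same query string with only the standard library. It is not the script above and is not a replacement for it. The function name `build_consent_url` is hypothetical, and the parameter set simply mirrors what appears in the printed URL (`response_type`, `client_id`, `scope`, `state`).

```python
import secrets
from urllib.parse import urlencode


def build_consent_url(auth_url, client_id, scopes, state=None):
    """Assemble an implicit-grant consent URL like the one printed above."""
    params = {
        "response_type": "token",        # implicit grant: ask for a token directly
        "client_id": client_id,
        "scope": " ".join(scopes),       # scopes are space-separated, encoded as '+'
        "state": state or secrets.token_urlsafe(16),  # random anti-forgery value
    }
    return "{}?{}".format(auth_url, urlencode(params))


url = build_consent_url(
    "https://login.microsoftonline.com/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/oauth2/v2.0/authorize",
    "xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx",
    ["Sites.ReadWrite.All", "Files.ReadWrite.All"],
)
print(url)
```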
2- AutomatedOneDriveAPIUploadFiles-public.py
This is the main script. It creates a public client application using the MSAL library, requests a token on behalf of the user, gains access to Microsoft Graph, and uses the OneDrive API to upload files.
Basic flow:
-Importing libraries
import os
import requests
import json
import msal
-Configuration
CLIENT_ID = 'xxxxxxxx-xxxxxxx-xxxxxx-xxxxxxx-xxxxxxxxxx'
TENANT_ID = 'xxxxxxxx-xxxxxxx-xxxxxx-xxxxxxx-xxxxxxxxxx'
AUTHORITY_URL = 'https://login.microsoftonline.com/{}'.format(TENANT_ID)
RESOURCE_URL = 'https://graph.microsoft.com/'
API_VERSION = 'v1.0'
USERNAME = 'xxxxxxxxx@xxxxxx.xxx'  # Office365 user's account username
PASSWORD = 'xxxxxxxxxxxxxxx'
SCOPES = ['Sites.ReadWrite.All', 'Files.ReadWrite.All']  # Add other scopes/permissions as needed.
-Create a public client application using the Microsoft Authentication Library (MSAL)
# Create a public client app, acquire an access token for the user, and set the header for API calls
cognos_to_onedrive = msal.PublicClientApplication(CLIENT_ID, authority=AUTHORITY_URL)
-Acquire a token from Microsoft identity platform endpoint to access Microsoft Graph API
token = cognos_to_onedrive.acquire_token_by_username_password(USERNAME,PASSWORD,SCOPES)
-Set the request header with access token
headers = {'Authorization': 'Bearer {}'.format(token['access_token'])}
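When the token request fails, MSAL returns a dictionary without an `access_token` key and typically with `error` and `error_description` entries instead, so the line above would raise a bare `KeyError`. A small guard can surface the real message (for example, an AADSTS code like those reported in the comments). This is a sketch only; `build_auth_headers` is a hypothetical helper, not part of MSAL or of the original script.

```python
def build_auth_headers(token_result):
    """Return the Authorization header, or raise with the error MSAL reported."""
    if "access_token" not in token_result:
        raise RuntimeError(
            "Token request failed: {}: {}".format(
                token_result.get("error"),
                token_result.get("error_description"),
            )
        )
    return {"Authorization": "Bearer {}".format(token_result["access_token"])}


# In the main script this would replace the direct dictionary lookup:
# headers = build_auth_headers(token)
```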
-Read all the files in the source directory
We loop through the directory, get each file's path and size, and open the file to read its data.
# Loop through the files inside the source directory
for root, dirs, files in os.walk(cognos_reports_source):
    for file_name in files:
        file_path = os.path.join(root, file_name)
        file_size = os.stat(file_path).st_size
        file_data = open(file_path, 'rb')
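One thing the loop above leaves open is the file handle: with around 200 reports per run, it may be worth closing each file once it has been uploaded. A possible variation, assuming nothing beyond the standard library, is to separate the directory walk from the file reading. `iter_report_files` is an illustrative name, not part of the original script.

```python
import os


def iter_report_files(source_dir):
    """Yield (file_name, file_path, file_size) for every file under source_dir."""
    for root, _dirs, files in os.walk(source_dir):
        for file_name in sorted(files):
            file_path = os.path.join(root, file_name)
            yield file_name, file_path, os.path.getsize(file_path)


# Usage sketch: the 'with' block closes each file after it has been handled.
# for file_name, file_path, file_size in iter_report_files(cognos_reports_source):
#     with open(file_path, "rb") as file_data:
#         ...  # simple or chunked upload goes here
```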
-If the file is smaller than 4 MB:
if file_size < 4100000:
-Perform a simple upload
# Perform a simple upload to the OneDrive API
r = requests.put(onedrive_destination + "/" + file_name + ":/content", data=file_data, headers=headers)
-If the file is larger than 4 MB:
-Create an upload session
upload_session = requests.post(onedrive_destination+"/"+file_name+":/createUploadSession", headers=headers).json()
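The two branches differ only in the suffix appended to the file's path, so the choice can be expressed in one small function. The sketch below assumes `onedrive_destination` is a path-addressed Graph folder URL such as `https://graph.microsoft.com/v1.0/me/drive/root:/Reports`. That value is an assumption, since the article does not show how it is defined. The helper `upload_url` is hypothetical; it also percent-encodes the file name, which the snippets above do not do.

```python
from urllib.parse import quote

SIMPLE_UPLOAD_LIMIT = 4100000  # same threshold as the size check above


def upload_url(onedrive_destination, file_name, file_size):
    """Pick the simple-upload or upload-session URL based on file size."""
    suffix = "content" if file_size < SIMPLE_UPLOAD_LIMIT else "createUploadSession"
    # quote() escapes spaces and other characters that are unsafe in a URL path
    return "{}/{}:/{}".format(onedrive_destination, quote(file_name), suffix)


# Assumed destination folder, for illustration only.
destination = "https://graph.microsoft.com/v1.0/me/drive/root:/Reports"
print(upload_url(destination, "daily report.pdf", 120000))
print(upload_url(destination, "archive.zip", 25000000))
```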
-Divide the file into byte chunks
total_file_size = os.path.getsize(file_path)
chunk_size = 327680  # 320 KiB
chunk_number = total_file_size // chunk_size
chunk_leftover = total_file_size - chunk_size * chunk_number
# f and i come from the full script: presumably the open file handle and the index of the current chunk
chunk_data = f.read(chunk_size)
start_index = i * chunk_size
end_index = start_index + chunk_size
-Set the header to match the starting index and end index of the byte chunk
headers = {'Content-Length':'{}'.format(chunk_size),'Content-Range':'bytes {}-{}/{}'.format(start_index, end_index-1, total_file_size)}
-Upload chunks
chunk_data_upload = requests.put(upload_session['uploadUrl'], data=chunk_data, headers=headers)
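The fragments above can be combined into a loop that also handles the final, shorter chunk, whose `Content-Length` and end index differ from the full-size chunks. The following is a sketch under stated assumptions rather than the code from the GitHub repository. The function names are illustrative, and `put` is passed in as a parameter (in practice `requests.put`) so the logic can be exercised without a network call.

```python
def iter_chunk_ranges(total_size, chunk_size=327680):
    """Yield (start, end) byte offsets per chunk; end is exclusive."""
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size)  # last chunk may be shorter
        yield start, end
        start = end


def chunk_headers(start, end, total_size):
    """Build the headers for one chunk; Content-Range uses an inclusive end."""
    return {
        "Content-Length": str(end - start),
        "Content-Range": "bytes {}-{}/{}".format(start, end - 1, total_size),
    }


def upload_in_chunks(file_obj, total_size, upload_url, put, chunk_size=327680):
    """Send the file chunk by chunk and return the response to the last PUT."""
    response = None
    for start, end in iter_chunk_ranges(total_size, chunk_size):
        chunk_data = file_obj.read(end - start)
        response = put(upload_url, data=chunk_data,
                       headers=chunk_headers(start, end, total_size))
    return response
```

In the main script this could be called as `upload_in_chunks(f, total_file_size, upload_session['uploadUrl'], requests.put)`. The sketch does no retrying or status checking, so a failed chunk would go unnoticed; a production version would inspect each response before continuing.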
For the complete code please visit my GitHub
To me, any time you achieve a goal, it is a success story, and by writing this code I achieved my goal of automating my report upload process to OneDrive! SUCCESS!
Top comments (10)
Hi Jason,
Thank you very much for this article, it is very well detailed and explanatory.
When following your steps for the authorisation, when I run the python script, pick up the account and approve the access, I end up on a Microsoft page saying AADSTS500113: No reply address is registered for the application.
Could you advise on what to do please?
Thank you in advance,
Arnaud
Error message on webpage when doing auth "python3 generateOneDriveAPIConsentURL-public.py":
AADSTS500113: No reply address is registered for the application
stackoverflow.com/questions/662627...
Adding a redirect URL solved this. (I use google.com as the redirect URL.)
Then, error message on cmd line when doing "python3 AutomatedOneDriveAPIUploadFiles-public.py": AADSTS7000218: The request body must contain the following parameter: 'client_secret' or 'client_assertion'
Google found this solution:
stackoverflow.com/questions/456094...
In the Manifest also you can control this by setting:
"allowPublicClient": true
After setting this, it works.
Is there a reason to use msal instead of github.com/OneDrive/onedrive-sdk-p...?
Really useful article otherwise, thanks!
Update: Apparently it's deprecated. pypi.org/project/onedrivesdk/ :-/ So I guess just a note for others that come here that since this, pypi.org/project/graph-onedrive/ has been released and might be worth checking out also.
Hi, I'm new, a university student. I want to use something like this to upload training files and reports from a Raspberry Pi. Do you think it would work? I can't seem to get the OAuth 2.0 authorization endpoint (v2). Either I don't know where to find it or it's not there.
You helped me a lot, thanks for sharing this information! It worked perfectly in python
Hugs from Brazil :)
Hi, thanks so much this was extremely helpful! My only question is, in this block of code:
# Looping through the files inside the source directory
for root, dirs, files in os.walk(cognos_reports_source):
    for file_name in files:
        file_path = os.path.join(root,file_name)
        file_size = os.stat(file_path).st_size
        file_data = open(file_path, 'rb')
What is cognos_reports_source?
Cheers
It's the directory that holds the files you wish to upload.
Hi Thanks a lot for this it helped a lot.
@jason I have an issue when clicking Accept in the link that printed from python AutomatedOneDriveAPIUploadFiles-public.py:
Please help me.