DEV Community

Matheus Fernandes


Python screenshots on Linux using Xlib

Well, I recently saw a video by Ben (Learn Code By Gaming) in which he teaches how to capture screenshots quickly enough to analyze the image in real time using OpenCV. But one thing bothered me a lot: he uses the win32 library, which makes the code incompatible with systems that run the X Window System, such as Linux.

So today, in my first post on Dev.to, I'll show how you can do the same thing on Linux, using Xlib!

Watching his playlist about the Albion Online bot, we can see his capture method evolve from the simplest version to a multithreaded one, so here I'll try to do the same, from simple to multithreaded.

First of all:

HOW DOES THE CAPTURE WORK?

First, since Xlib has no native method for finding a window by name, we must list all the windows that are active on the system.

We can do that using:

import Xlib.display
from Xlib import X

display = Xlib.display.Display()
root = display.screen().root
windowIDs = root.get_full_property(display.intern_atom('_NET_CLIENT_LIST'), X.AnyPropertyType).value

As the variable name suggests, this gives us only the window IDs, not their names, but it's a great start!

window = display.create_resource_object('window', windowID)
window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)

We can obtain the name of a window by creating a resource object of type window (an Xlib object) for its ID and then asking for that window's '_NET_WM_NAME' property.

But we have a list of IDs, not just one as shown in the example. That's fine: since we have a list, we can iterate over it and look for the name of the window we want to find. Something like:

searching_window_title = "albion online"

for windowID in windowIDs:
    window = display.create_resource_object('window', windowID)
    window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)

    if window_title_property and searching_window_title.lower() in window_title_property.value.decode('utf-8').lower():
        geometry = window.get_geometry()
        width, height = geometry.width, geometry.height

        pixmap = window.get_image(0, 0, width, height, X.ZPixmap, 0xffffffff)
        data = pixmap.data
        final_image = np.frombuffer(data, dtype='uint8').reshape((height, width, 4))
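To make the name test easier to reason about, here is a minimal sketch of it pulled out into plain Python, so it runs without an X server. The helper names `title_matches` and `find_window_id` and the sample IDs and titles are made up for illustration; they are not part of Xlib. I'm assuming the title usually arrives as UTF-8 bytes, as in the loop above, and that some windows have no '_NET_WM_NAME' at all.

```python
def title_matches(raw_title, wanted):
    """Case-insensitive substring test against a raw window title."""
    if raw_title is None:
        # the window has no title property
        return False
    if isinstance(raw_title, bytes):
        # replace undecodable bytes instead of raising UnicodeDecodeError
        raw_title = raw_title.decode("utf-8", errors="replace")
    return wanted.lower() in raw_title.lower()


def find_window_id(titles_by_id, wanted):
    """Return the first window id whose title matches, or None."""
    for window_id, raw_title in titles_by_id.items():
        if title_matches(raw_title, wanted):
            return window_id
    return None


# Made-up ids and titles standing in for what the X server would return.
sample = {
    0x1400003: b"Terminal",
    0x2600007: b"Albion Online Client",
    0x3000001: None,  # a window with no title set
}

found = find_window_id(sample, "albion online")
missing = find_window_id(sample, "firefox")
```

Keeping the comparison in one small function also makes it easy to change the rule later, for example to an exact match instead of a substring match.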

Well, this is the logic we will use to capture the window. We can turn it into a function, like:

import numpy as np
import Xlib.display
from Xlib import X


def capture_window(window_title):
    display = Xlib.display.Display()
    root = display.screen().root
    windowIDs = root.get_full_property(display.intern_atom('_NET_CLIENT_LIST'), X.AnyPropertyType).value
    final_image = None

    for windowID in windowIDs:
        window = display.create_resource_object('window', windowID)
        window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)

        # match against the title passed in by the caller (case-insensitive)
        if window_title_property and window_title.lower() in window_title_property.value.decode('utf-8').lower():
            geometry = window.get_geometry()
            width, height = geometry.width, geometry.height

            # grab the raw pixels of the whole window
            pixmap = window.get_image(0, 0, width, height, X.ZPixmap, 0xffffffff)
            data = pixmap.data
            # 4 bytes per pixel, one row after another
            final_image = np.frombuffer(data, dtype='uint8').reshape((height, width, 4))
            break

    display.close()
    # None if no window title matched
    return final_image
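The reshape to (height, width, 4) may look a bit magical, so here is a small sketch of what it relies on, in plain Python with a synthetic 2x2 buffer. It assumes 4 bytes per pixel stored row by row with no padding, in blue, green, red, unused order, which is what I'd expect on a typical 24/32-bit little-endian display but is not guaranteed everywhere. `pixel_at` is a hypothetical helper, not an Xlib call.

```python
def pixel_at(data, width, x, y):
    """Return (blue, green, red) for pixel (x, y) in a 4-bytes-per-pixel buffer."""
    # rows are stored one after another, 4 bytes per pixel
    offset = (y * width + x) * 4
    return data[offset], data[offset + 1], data[offset + 2]


# Synthetic 2x2 image, assuming B, G, R, unused byte order.
width, height = 2, 2
data = bytes([
    255, 0, 0, 0,      0, 255, 0, 0,       # row 0: blue, green
    0, 0, 255, 0,      255, 255, 255, 0,   # row 1: red, white
])

bottom_left = pixel_at(data, width, 0, 1)
top_right = pixel_at(data, width, 1, 0)
```

Under that assumption the first three channels are already in the BGR order OpenCV works with, and the fourth byte can be ignored.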

Of course, this is a very crude example: it opens a new connection and walks the whole window list on every call, so using it in real time can cause a terrible bottleneck in response time. Let's improve this a little and add multithreading.

import numpy as np
from threading import Thread, Lock
import Xlib
import Xlib.display
from Xlib import X

class WindowCapture:
    stopped = True
    lock = None
    screenshot = None
    windowId = None

    def __init__(self, window_name='Albion Online'):
        self.lock = Lock()
        self.screenshot = None 
        display = Xlib.display.Display()
        try:
            root = display.screen().root
            windowIDs = root.get_full_property(display.intern_atom('_NET_CLIENT_LIST'), X.AnyPropertyType).value

            for windowID in windowIDs:
                window = display.create_resource_object('window', windowID)
                window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)

                if window_title_property and window_name.lower() in window_title_property.value.decode('utf-8').lower():
                    self.windowId = windowID

            if not self.windowId:
                raise Exception('Window not found: {}'.format(window_name))
        finally:
            display.close()

    def get_screenshot(self):
        # open a short-lived connection for this capture
        display = Xlib.display.Display()
        try:
            window = display.create_resource_object('window', self.windowId)

            geometry = window.get_geometry()
            width, height = geometry.width, geometry.height

            pixmap = window.get_image(0, 0, width, height, X.ZPixmap, 0xffffffff)
            data = pixmap.data
            # 4 bytes per pixel, one row after another
            return np.frombuffer(data, dtype='uint8').reshape((height, width, 4))
        finally:
            # close the connection even if the capture fails
            display.close()

    # threading methods
    def start(self):
        self.stopped = False
        t = Thread(target=self.run)
        t.start()

    def stop(self):
        self.stopped = True

    def run(self):
        while not self.stopped:
            # get an updated image of the game
            screenshot = self.get_screenshot()
            # hold the lock only while publishing the new frame
            with self.lock:
                self.screenshot = screenshot
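If you want to play with the threading part on its own, here is a minimal sketch of the same "keep only the latest frame" idea with the X11 capture swapped for a fake grab function, so it runs anywhere. `LatestFrame`, `fake_grab` and `read` are names I made up for this sketch. Unlike the class above, it reads the frame under the lock and waits for the thread to finish on stop.

```python
import itertools
import threading
import time


class LatestFrame:
    """Run a grab function in a background thread and keep its newest result."""

    def __init__(self, grab):
        self._grab = grab                  # callable returning one frame
        self._lock = threading.Lock()
        self._frame = None
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()            # wait for the loop to exit

    def read(self):
        with self._lock:
            return self._frame

    def _run(self):
        while not self._stop.is_set():
            frame = self._grab()           # slow work happens outside the lock
            with self._lock:               # lock only while swapping the frame
                self._frame = frame


_counter = itertools.count()


def fake_grab():
    """Stand-in for a real capture: sleeps briefly, returns a frame number."""
    time.sleep(0.001)
    return next(_counter)


cap = LatestFrame(fake_grab)
before = cap.read()
cap.start()
time.sleep(0.05)
cap.stop()
last = cap.read()
```

The design choice is the same as in the class above: the slow capture runs outside the lock, and the lock is held only for the instant it takes to swap the reference.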

Here I used a thread to capture the window continuously and store the latest frame in the screenshot attribute. We can use this class as follows:

import cv2 as cv
from windowCapture_linux import WindowCapture

wincap = WindowCapture('albion online')
wincap.start()

while True:
    # take a reference to the most recent frame under the lock
    with wincap.lock:
        image = wincap.screenshot

    if image is not None:
        cv.imshow("Screenshot", image)

    # press q to quit
    if cv.waitKey(1) == ord('q'):
        break

# stop the capture thread and close the preview window
wincap.stop()
cv.destroyAllWindows()


In this example I used Albion Online, but it works with any window :)

I hope this little tutorial was useful, good luck and see you later!
