Well, I recently saw a video by Ben (Learn Code By Gaming) in which he shows how to capture screenshots quickly enough to analyze the image in real time with OpenCV. But one thing bothered me a lot: he uses the win32 library, which makes the code incompatible with systems that run an X Window System (X11) server, such as Linux.
So today, in my first post on Dev.to, I'll show how you can do the same thing on Linux, using Xlib!
Watching his playlist about the Albion Online bot, we can see the evolution of his capture method, from the simplest version to a multi-threaded one, so here I'll try to do the same.
First of all:
HOW DOES THE CAPTURE WORK?
First, since Xlib has no native method to search for a window by name, we must list all the windows that are active on the system.
We can do that using:
import Xlib.display
from Xlib import X

display = Xlib.display.Display()
root = display.screen().root
windowIDs = root.get_full_property(display.intern_atom('_NET_CLIENT_LIST'), X.AnyPropertyType).value
As you can imagine (or should) from the name of the variable, we only have the IDs of the windows, not their names, but it's a great start!
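One caveat before moving on: as far as I know, `get_full_property` returns `None` when the property does not exist (for example, under a window manager that does not set `_NET_CLIENT_LIST`), so reading `.value` directly can fail with an `AttributeError`. Here is a minimal sketch of a guard. `property_values` is a helper name I made up, not part of Xlib, and the fake reply only exists so the snippet runs without an X server:

```python
from types import SimpleNamespace


def property_values(prop):
    """Return the values of an X property reply as a list, or [] if it is missing.

    `prop` is whatever get_full_property returned: an object with a
    `.value` attribute, or None when the property does not exist.
    """
    if prop is None:
        return []
    return list(prop.value)


# Stand-in objects so the sketch runs without an X server.
fake_reply = SimpleNamespace(value=[0x2400003, 0x3200007])
print(property_values(fake_reply))
print(property_values(None))
```

With that in place, the listing would become something like `windowIDs = property_values(root.get_full_property(...))`.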
We can obtain the name of a window by creating a new resource object of type window (an Xlib object) and then asking that window for its '_NET_WM_NAME' property:
window = display.create_resource_object('window', windowID)
window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)
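The `.value` of that property comes back as raw bytes, so it has to be decoded before we can compare it with the title we are looking for. A small sketch of that comparison: `title_matches` is a name I made up for illustration, and I'm assuming that replacing undecodable bytes is acceptable for a simple substring search:

```python
def title_matches(raw_title, wanted):
    """Case-insensitive substring test between a raw window title and `wanted`."""
    if not raw_title:
        return False
    # errors="replace" keeps a malformed title from raising UnicodeDecodeError.
    title = raw_title.decode("utf-8", errors="replace")
    return wanted.lower() in title.lower()


print(title_matches(b"Albion Online Client", "albion online"))  # True
print(title_matches(b"Terminal", "albion online"))              # False
```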
But then we realize that we have a list of IDs, not just one as in the example. That's fine: since we have a list, we can iterate over it and look for the name of the window we want. Something like:
import numpy as np

searching_window_title = "albion online"
for windowID in windowIDs:
    window = display.create_resource_object('window', windowID)
    window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)
    # case-insensitive match against the decoded window title
    if window_title_property and searching_window_title.lower() in window_title_property.value.decode('utf-8').lower():
        geometry = window.get_geometry()
        width, height = geometry.width, geometry.height
        pixmap = window.get_image(0, 0, width, height, X.ZPixmap, 0xffffffff)
        data = pixmap.data
        # raw bytes -> array of shape (height, width, 4)
        final_image = np.frombuffer(data, dtype='uint8').reshape((height, width, 4))
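About that last reshape: it assumes the server returns 4 bytes per pixel, which is the usual case on a 24-bit or 32-bit display, typically in blue, green, red, padding order on little-endian machines. Both are assumptions about the display, not something Xlib guarantees. Here is a small sketch with a fake 2x2 buffer that shows the reshape and how to drop the fourth byte to get a 3-channel BGR image:

```python
import numpy as np

width, height = 2, 2
# Fake ZPixmap data: every pixel is (blue=10, green=20, red=30, pad=0).
data = bytes([10, 20, 30, 0] * (width * height))

bgrx = np.frombuffer(data, dtype=np.uint8).reshape((height, width, 4))
# Drop the padding byte; copy() makes the array writable and contiguous.
bgr = bgrx[:, :, :3].copy()

print(bgrx.shape, bgr.shape)  # (2, 2, 4) (2, 2, 3)
print(bgr[0, 0])              # [10 20 30]
```

The copy matters because `np.frombuffer` gives a read-only view of the original bytes, so anything that wants to draw on the image in place needs its own writable array.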
Well, this is the logic we will use to capture the window. We can turn it into a function, like:
def capture_window(window_title):
    display = Xlib.display.Display()
    root = display.screen().root
    windowIDs = root.get_full_property(display.intern_atom('_NET_CLIENT_LIST'), X.AnyPropertyType).value
    final_image = None
    for windowID in windowIDs:
        window = display.create_resource_object('window', windowID)
        window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)
        # use the window_title argument instead of a hard-coded title
        if window_title_property and window_title.lower() in window_title_property.value.decode('utf-8').lower():
            geometry = window.get_geometry()
            width, height = geometry.width, geometry.height
            pixmap = window.get_image(0, 0, width, height, X.ZPixmap, 0xffffffff)
            data = pixmap.data
            final_image = np.frombuffer(data, dtype='uint8').reshape((height, width, 4))
            break
    display.close()
    # None means no window title contained window_title
    return final_image
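To get a feel for how expensive each call is, you can simply time it. A rough sketch, where `measure_fps` is a hypothetical helper and `fake_capture` stands in for `capture_window` so that the snippet runs without an X server:

```python
import time


def measure_fps(capture, frames=20):
    """Call `capture` `frames` times and return the average calls per second."""
    start = time.perf_counter()
    for _ in range(frames):
        capture()
    elapsed = time.perf_counter() - start
    return frames / elapsed if elapsed > 0 else float("inf")


def fake_capture():
    # Stand-in for capture_window("albion online"); sleeps about 5 ms per "frame".
    time.sleep(0.005)


print(f"{measure_fps(fake_capture):.1f} captures per second")
```

On a real system you would pass something like `lambda: capture_window("albion online")` instead of `fake_capture`.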
Of course, this is a very crude example: using this logic in real time can cause a terrible bottleneck in response time, since every call opens a new connection and searches through all the windows again. Let's improve this a little and add multi-threading.
import numpy as np
from threading import Thread, Lock
import Xlib
import Xlib.display
from Xlib import X
class WindowCapture:
    stopped = True
    lock = None
    screenshot = None
    windowId = None

    def __init__(self, window_name='Albion Online'):
        self.lock = Lock()
        self.screenshot = None
        display = Xlib.display.Display()
        try:
            root = display.screen().root
            windowIDs = root.get_full_property(display.intern_atom('_NET_CLIENT_LIST'), X.AnyPropertyType).value
            for windowID in windowIDs:
                window = display.create_resource_object('window', windowID)
                window_title_property = window.get_full_property(display.intern_atom('_NET_WM_NAME'), 0)
                if window_title_property and window_name.lower() in window_title_property.value.decode('utf-8').lower():
                    self.windowId = windowID
            if not self.windowId:
                raise Exception('Window not found: {}'.format(window_name))
        finally:
            display.close()

    def get_screenshot(self):
        display = Xlib.display.Display()
        window = display.create_resource_object('window', self.windowId)
        geometry = window.get_geometry()
        width, height = geometry.width, geometry.height
        pixmap = window.get_image(0, 0, width, height, X.ZPixmap, 0xffffffff)
        data = pixmap.data
        image = np.frombuffer(data, dtype='uint8').reshape((height, width, 4))
        display.close()
        return image

    # threading methods
    def start(self):
        self.stopped = False
        t = Thread(target=self.run)
        t.start()

    def stop(self):
        self.stopped = True

    def run(self):
        while not self.stopped:
            # get an updated image of the game
            screenshot = self.get_screenshot()
            # lock the thread while updating the results
            self.lock.acquire()
            self.screenshot = screenshot
            self.lock.release()
Here I used a thread to capture the window continuously and store the result in the screenshot attribute. We can use this class as follows:
import cv2 as cv
from windowCapture_linux import WindowCapture

wincap = WindowCapture('albion online')
wincap.start()
run = True
while run:
    if wincap.screenshot is not None:
        image = wincap.screenshot
        cv.imshow("Screenshot", image)
    key = cv.waitKey(1)
    if key == ord('q'):
        run = False
wincap.stop()
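One thing the loop above skips is the lock: it reads `wincap.screenshot` directly while the capture thread may be writing it. As far as I know, assigning a reference is atomic in CPython, so this usually works, but if you want to be explicit, here is a sketch of the same producer/consumer pattern with a locked getter and a `join()` on stop. `LatestValue` and the counter-based producer are made up for illustration; in the real class, the producer would be `get_screenshot`:

```python
import itertools
import time
from threading import Lock, Thread


class LatestValue:
    """Runs `produce` in a background thread and keeps only its latest result."""

    def __init__(self, produce):
        self._produce = produce
        self._lock = Lock()
        self._value = None
        self._stopped = True
        self._thread = None

    def start(self):
        self._stopped = False
        # daemon=True so a forgotten stop() does not keep the program alive.
        self._thread = Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stopped = True
        if self._thread is not None:
            self._thread.join()  # wait for the loop to finish its last pass

    def get(self):
        # Read under the same lock the writer uses.
        with self._lock:
            return self._value

    def _run(self):
        while not self._stopped:
            value = self._produce()  # do the slow work outside the lock
            with self._lock:
                self._value = value


counter = itertools.count(1)
latest = LatestValue(lambda: next(counter))
latest.start()
time.sleep(0.05)
print(latest.get() is not None)
latest.stop()
```

The slow call stays outside the lock on purpose, so the reader is only ever blocked for the duration of a single assignment.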
In this example I used Albion Online, but it works with any window :)
I hope this little tutorial was useful, good luck and see you later!