Feeding Key Presses to Reluctant Games in Python

03/15/2019 | Tutorials

Controlling games externally by simulating key presses and mouse movements is far more troublesome than it should be. In my most recent experience, the culprit was the games' habit of reading input through something called DirectInput. This post shows how I dealt with it in Python.

Disclaimer: Most of the claims in this post are my assumptions, based on some very naïve research I did while troubleshooting something I was working on. None of it should be taken as fact or trusted blindly without verifying it for yourself first.

A few times in the past I have worked on different kinds of game bots: to automate some tedious task, to avoid grinding for hours, or to gain an advantage over other players (non-competitive and for fun only; I do understand that cheating has no place in competitive games).

Most of the time I tried to write a bot that simulates user interaction, sending mouse and keyboard messages to the game, to avoid quick detection by anti-cheat software that checks for a modified game.

One of the techniques I tried to employ was interacting with other players in the game chat, usually to let them know that I (the bot) am very new to the game, that I am still learning and so on, so that they would not report me and get the bot discovered.

Let's say that to open the chat, the player needs to press the T key, type the message (a sequence of character key presses) and hit the ENTER key to send it.

Naïvely, I would write three functions:

  • get_game_window() - searches for the game window and brings it to the foreground.
  • type_key(key) - presses and releases the key of my choosing.
  • send_message(message) - chains pressing the T key, typing the message and hitting the ENTER key to send it.

Sure enough, I am not the first one interested in such functionality. A library for it already exists, and using it makes implementing the latter two functions trivial:

from pynput.keyboard import Key, Controller
from window_helper import WindowMgr

keyboard = Controller()
w = WindowMgr()

def get_game_window():
   w.find_window_wildcard("Minecraft 1*")    # Game window is named 'Minecraft 1.13.1' for example.
   w.set_foreground()


def type_key(key):
   keyboard.press(key)
   keyboard.release(key)

def send_message(message):
   type_key('t')           # Opens chat window.

   for char in message:    # Types characters one by one.
       type_key(char)
   
   type_key(Key.enter)     # Finally, submits the message.

The code above will not run unless you have installed the pynput package on your system (pip install pynput should do the trick).

The send_message(message) function sends the keystrokes to whichever window is in the foreground when the code runs, which by default is not the game you want to direct your messages to. That's where the get_game_window() function and a little bit of timing come into play. In the snippet above, I am importing WindowMgr from window_helper.py, a class that you can find among my Gists.
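The "bit of timing" can be as simple as a short pause between focusing the window and typing. Here is a minimal sketch of that idea. The helper name focus_then_send and the half-second default are my own assumptions, not something taken from pynput or from the original code, and the right delay will likely differ from machine to machine.

```python
import time


def focus_then_send(focus, send, message, settle_seconds=0.5):
    """Bring the game to the foreground, wait a moment, then send the message.

    focus and send are plain callables, for example get_game_window and
    send_message from the snippet above. settle_seconds is a guessed default.
    """
    focus()                     # switch to the game window
    time.sleep(settle_seconds)  # give the window time to actually take focus
    send(message)               # only now start typing
```

With the functions from the snippet above, the call would look like focus_then_send(get_game_window, send_message, "hello").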

All of this combined should be enough to locate the Minecraft game window, open the chat, type the message and submit it, right? Well, as it happens, wrong!

The code will successfully bring the Minecraft window to the foreground, open the chat and even write the message, but it will not submit it. From what I could gather, this is because the standard way of simulating keystrokes relies on Win32 API calls, while most games read player input for certain keys and key combinations through DirectInput.

Further reading on the topic convinced me that DirectInput is a good thing for games and for players, but hell for me as a bot developer. I will not go into the details of how DirectInput works; I am sure you can google that for yourself. Instead, I will present the solution I implemented.

Final Solution for Games Utilizing DirectInput

First of all, forget about the pynput package. It cannot be painlessly combined with this solution; I tried for some time, but it ended up breaking things.

After much googling, I ran into this Reddit thread where user Final_Spartan was trying to simulate mouse movements and another user, SerpentAI, rushed to their aid. SerpentAI is the author of the SerpentAI Game Agent Framework, which happens to solve this issue for us in its native_win32_input_controller.py file.

We basically need two things from this file:

  1. Functions to simulate key presses.
  2. A simplified character map and a reference table in which to look up key codes.

The first part is easily addressed: we copy the classes KeyBdInput, HardwareInput, MouseInput, Input_I and Input, and then copy and modify the functions that press and release keys. We should now have something like this:

import ctypes

PUL = ctypes.POINTER(ctypes.c_ulong)

class KeyBdInput(ctypes.Structure):
   _fields_ = [("wVk", ctypes.c_ushort),
               ("wScan", ctypes.c_ushort),
               ("dwFlags", ctypes.c_ulong),
               ("time", ctypes.c_ulong),
               ("dwExtraInfo", PUL)]


class HardwareInput(ctypes.Structure):
   _fields_ = [("uMsg", ctypes.c_ulong),
               ("wParamL", ctypes.c_ushort),    # WORD in the Win32 HARDWAREINPUT struct, so unsigned
               ("wParamH", ctypes.c_ushort)]


class MouseInput(ctypes.Structure):
   _fields_ = [("dx", ctypes.c_long),
               ("dy", ctypes.c_long),
               ("mouseData", ctypes.c_ulong),
               ("dwFlags", ctypes.c_ulong),
               ("time", ctypes.c_ulong),
               ("dwExtraInfo", PUL)]


class Input_I(ctypes.Union):
   _fields_ = [("ki", KeyBdInput),
               ("mi", MouseInput),
               ("hi", HardwareInput)]


class Input(ctypes.Structure):
   _fields_ = [("type", ctypes.c_ulong),
               ("ii", Input_I)]

def press_key(key):
   extra = ctypes.c_ulong(0)
   ii_ = Input_I()

   flags = 0x0008    # KEYEVENTF_SCANCODE: treat "key" as a hardware scan code

   ii_.ki = KeyBdInput(0, key, flags, 0, ctypes.pointer(extra))
   x = Input(ctypes.c_ulong(1), ii_)
   ctypes.windll.user32.SendInput(1, ctypes.pointer(x), ctypes.sizeof(x))

def release_key(key):
   extra = ctypes.c_ulong(0)
   ii_ = Input_I()

   flags = 0x0008 | 0x0002    # KEYEVENTF_SCANCODE | KEYEVENTF_KEYUP

   ii_.ki = KeyBdInput(0, key, flags, 0, ctypes.pointer(extra))
   x = Input(ctypes.c_ulong(1), ii_)
   ctypes.windll.user32.SendInput(1, ctypes.pointer(x), ctypes.sizeof(x))

With the code above we should be able to press and release keys as we like. Now we need the character map and a reference table. In the linked reference table, we can look up which scan code (e.g. 0x1C for the ENTER key) needs to be passed to the press_key() and release_key() functions.
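The lookup can also live in code rather than in a table you consult by hand. Below is a minimal sketch of a partial character map. SCAN_CODES and to_scan_codes are hypothetical names of my own, not the map from the SerpentAI file, and the map only covers lowercase letters, space and ENTER. The values are standard set 1 keyboard scan codes, which run consecutively along each letter row of a QWERTY keyboard.

```python
# Hypothetical, partial character map: letters, space and ENTER only.
# Each keyboard row uses consecutive scan codes, so we build it row by row.
SCAN_CODES = {}
for first_code, row in ((0x10, "qwertyuiop"), (0x1E, "asdfghjkl"), (0x2C, "zxcvbnm")):
    for offset, char in enumerate(row):
        SCAN_CODES[char] = first_code + offset
SCAN_CODES[" "] = 0x39   # space bar
SCAN_CODES["\n"] = 0x1C  # ENTER


def to_scan_codes(text):
    """Translate text into a list of scan codes.

    Uppercase letters are folded to lowercase (holding SHIFT is out of scope
    for this sketch); characters missing from the map raise ValueError.
    """
    codes = []
    for char in text.lower():
        if char not in SCAN_CODES:
            raise ValueError(f"no scan code known for {char!r}")
        codes.append(SCAN_CODES[char])
    return codes


print([hex(code) for code in to_scan_codes("hello")])
# → ['0x23', '0x12', '0x26', '0x26', '0x18']
```

Digits, punctuation and anything that needs a modifier key would have to be added from the reference table.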

Given the example from the beginning of this post, let's locate the Minecraft window, bring it to the front, open the chat, type "hello" and submit it by hitting the ENTER key:

get_game_window()
press_key(0x14)    # t - opens chat
release_key(0x14)

press_key(0x23); release_key(0x23)    # h
press_key(0x12); release_key(0x12)    # e
press_key(0x26); release_key(0x26)    # l
press_key(0x26); release_key(0x26)    # l
press_key(0x18); release_key(0x18)    # o

press_key(0x1C); release_key(0x1C)    # ENTER - submits the message

And there it is! Minecraft registers all of the keys pressed, including the ENTER key.

In a more realistic scenario, you would probably write yourself a wrapper to make sending whole words or sentences easier, but I decided not to do it here for the sake of brevity. I am putting all of this on my GitHub, and you are welcome to dig into it further there. The basic character map mentioned earlier is included.
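For readers who want a starting point, such a wrapper might look roughly like the sketch below. It is not the code from my GitHub. type_text is a hypothetical helper that takes the press and release functions and the character map as arguments, so it can be tried out with stand-ins before pointing it at a real game. The pause between keys is an assumption of mine; whether a given game needs one is something to test.

```python
import time


def type_text(text, press, release, scan_codes, delay=0.05):
    """Press and release the key for each character of text, in order.

    press and release are callables taking one scan code (for example
    press_key and release_key); scan_codes maps characters to scan codes.
    """
    # Resolve every character first, so nothing is typed if one is unmapped
    # (an unmapped character raises KeyError here).
    codes = [scan_codes[char] for char in text]
    for code in codes:
        press(code)
        release(code)
        time.sleep(delay)  # assumed pause between keys; tune or set to 0


# Dry run: record the events instead of sending real key presses.
events = []
type_text(
    "hello",
    press=lambda code: events.append(("down", code)),
    release=lambda code: events.append(("up", code)),
    scan_codes={"h": 0x23, "e": 0x12, "l": 0x26, "o": 0x18},
    delay=0,
)
print(len(events))  # → 10
```

In the real bot, the call would pass press_key and release_key along with the full character map.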

As always, if you have questions, do not hesitate to reach out; I will do my best to answer.