➡️ Read Part 2: Training the Model
➡️ Read Part 3: Real-Time Inference – Watching JumpNet Come Alive
🤔 Why Build a Data Pipeline in the First Place?
Before any model could learn to play, it first needed to observe. That meant building a full pipeline to capture what the game sees and how a human reacts to it.
My goal was to create a dataset that records:
- What was on the screen (image)
- Whether a jump action was taken (label)
- Which key was pressed (multi-hot key vector)
- How long the key was held (duration)
- The phase (press/release)
- The frame index (for video sync)
You can explore how I built the actual snipping, key-logging, and GUI tools in detail here:
🔗 Modular Snip Recorder – Part 1
🔗 Modular Viewer – Part 2
🔗 GitHub for Data Tool
🧱 Step-by-Step Data Pipeline
🎯 Step 1: Extracting Keypress Events (Positive Samples)
I started with the raw .npz files generated by my snipping tool. Then I cleaned them and extracted only the first press–release pairs using the script below:

```python
# Data_preproccer.py
...
entry_press = (
    press_image,      # (227, 227, 3)
    1.0,              # label = jump
    press_multi_hot,  # keys_raw
    hold_vec,         # how long the key was held
    "press",          # phase
    press_frame,      # video sync frame
)
```
This step ensures that we only feed clean, clear decision points into the model.
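The pairing logic can be sketched as follows. The event layout (`(frame_idx, key, phase)` tuples) and the function name are illustrative stand-ins, not the actual structure of my .npz files:

```python
def first_press_release_pairs(events):
    """Pair each FIRST 'press' of a key with the next 'release' of that key.

    `events` is a hypothetical list of (frame_idx, key, phase) tuples,
    sorted by frame index; the real recorder output may differ.
    """
    open_press = {}  # key -> frame index of its first (unreleased) press
    pairs = []
    for frame_idx, key, phase in events:
        if phase == "press" and key not in open_press:
            open_press[key] = frame_idx  # repeated presses are ignored
        elif phase == "release" and key in open_press:
            pairs.append((key, open_press.pop(key), frame_idx))
    return pairs

events = [
    (10, "space", "press"),
    (12, "space", "press"),    # key-repeat press, ignored
    (18, "space", "release"),
    (40, "space", "press"),
    (47, "space", "release"),
]
# Each pair: (key, press_frame, release_frame)
print(first_press_release_pairs(events))  # [('space', 10, 18), ('space', 40, 47)]
```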
🔁 Step 2: Data Augmentation for Robustness
Because the number of positive samples was relatively small, I applied basic image augmentations to increase their diversity:
- Horizontal flip
- Brightness variation
- Gaussian noise
- Horizontal shift
```python
# Data_argumentation.py
if random.random() < 0.5:
    aug_img = cv2.flip(aug_img, 1)  # Flip
...
aug_img = np.clip(aug_img + noise, 0, 1)  # Add noise
```
These augmentations tripled the size of the final positive dataset.
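A minimal NumPy-only version of those four augmentations might look like the sketch below. The real script uses OpenCV; the flip probability, brightness range, noise scale, and shift range here are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Apply flip, brightness, Gaussian noise, and horizontal shift
    to a float image in [0, 1]. Parameter ranges are illustrative."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]                       # horizontal flip
    out = out * rng.uniform(0.8, 1.2)               # brightness variation
    out = out + rng.normal(0.0, 0.02, out.shape)    # Gaussian noise
    shift = int(rng.integers(-10, 11))              # horizontal shift in pixels
    out = np.roll(out, shift, axis=1)
    return np.clip(out, 0.0, 1.0)                   # keep values in range

img = rng.random((227, 227, 3)).astype(np.float32)
aug = augment(img)
print(aug.shape)
```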
❌ Step 3: Extracting Negative Samples from Video
I also wanted to teach the model when not to jump. So I went back to the raw video and excluded ±2 frames around every actual jump; every remaining frame became a negative sample:
```python
# Data_Negativ_scrapping.py
if frame_idx not in excluded:
    entry = [frame_rgb, 0.0, [0.], [0.], "press", frame_idx]
```
These "do nothing" frames are crucial โ without them, the model would jump all the time.
⚖️ Step 4: Downsampling the Negative Data
Since there were too many negative frames, I randomly selected 500 of them for balance:
```python
# Data_reducier.py
indices = np.random.choice(total, 500, replace=False)
```
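`replace=False` is what guarantees 500 distinct indices. A self-contained sketch with stand-in data (the integer entries and the seed are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
neg_data = list(range(4000))                 # stand-in for the negative entries
indices = rng.choice(len(neg_data), 500, replace=False)  # no duplicates
subset = [neg_data[i] for i in indices]

print(len(subset), len(set(subset)))  # 500 500
```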
🧬 Step 5: Merging All Data into One File
I then merged positive + negative data and shuffled everything:
```python
# Data_merge_final.py
all_data = np.concatenate([pos_data, neg_data], axis=0)
np.random.shuffle(all_data)
np.savez(output_path, data=all_data)
```
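Because each entry mixes an image, floats, a string, and an int, this kind of merge works on 1-D object arrays, so that concatenation and shuffling move whole entries around rather than individual fields. A toy version (the entry contents are stand-ins):

```python
import numpy as np

def to_object_array(entries):
    """Pack heterogeneous entries into a 1-D object array so that
    np.concatenate and np.random.shuffle operate on whole entries."""
    arr = np.empty(len(entries), dtype=object)
    for i, entry in enumerate(entries):
        arr[i] = entry
    return arr

# Toy stand-ins for (image, label, keys, hold, phase, frame_index) entries.
pos_data = to_object_array([(None, 1.0, [1.], [0.3], "press", i) for i in range(3)])
neg_data = to_object_array([(None, 0.0, [0.], [0.], "press", 100 + i) for i in range(2)])

all_data = np.concatenate([pos_data, neg_data], axis=0)
np.random.shuffle(all_data)          # in-place shuffle of whole entries

labels = sorted(entry[1] for entry in all_data)
print(len(all_data), labels)  # 5 [0.0, 0.0, 1.0, 1.0, 1.0]
```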
📦 Final Dataset Snapshot
Here's what a few samples look like:
📦 Total Samples: 1778 (1278 positive + 500 negative)
🔍 Entry #1
image.shape : (227, 227, 3)
label : 1.0
keys_raw : [1.]
hold_duration : [0.294]
phase : press
frame_index : 7720
🔍 Entry #3
label : 0.0
hold_duration : [0.0]
frame_index : 872
Each entry is a tuple of:

```python
(
    image,          # RGB, normalized
    label,          # 1 = jump, 0 = no jump
    keys_raw,       # multi-hot key state
    hold_duration,  # seconds
    phase,          # "press"
    frame_index,    # for video mapping
)
```
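A quick round trip shows how such a tuple survives `np.savez` and comes back intact. The filename is arbitrary, and note that object arrays are pickled, so loading needs `allow_pickle=True`:

```python
import numpy as np

# Build a one-entry dataset matching the layout above and round-trip it.
entry = (np.zeros((227, 227, 3), dtype=np.float32), 1.0, [1.], [0.294], "press", 7720)
data = np.empty(1, dtype=object)
data[0] = entry
np.savez("dataset_demo.npz", data=data)

loaded = np.load("dataset_demo.npz", allow_pickle=True)["data"]
image, label, keys_raw, hold_duration, phase, frame_index = loaded[0]
print(image.shape, label, phase, frame_index)  # (227, 227, 3) 1.0 press 7720
```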
🤪 Bonus: File Explorer + Sample Inspector
Once the data was collected, I needed a way to explore it visually and catch any outliers. That's where my Viewer tool came in handy.
🔗 Check out how I built it here:
Modular Viewer – Data Explorer Tool
🎯 Wrap-up
With this pipeline, I went from chaotic video footage to a neat, clean, labeled dataset tailored for behavior cloning. Whether the model needs to jump or stay still, it now has the context to make that decision, just like a human would.
🔜 Coming Up Next: Part 2 – Model Training & Evaluation
In the next post, we'll move from data to decisions. I'll walk you through how I built the model that actually uses this data, covering architecture, training process, and performance evaluation.
Part 2 – From Pixels to Policy: Training JumpNet to Make the Right Move 🚀
🔗 Repo & Resources
- GitHub: Dataset Tooling + Viewer
- GitHub: Data Tool, Train and Simulation Codes
"Garbage in, garbage out."
This is the phase where we make sure the input is not garbage. ๐