The community fix dataset
WHERE THE 58,000+ FIXES COME FROM
FixMyPrint's fix-intelligence engine is built on posts extracted from r/FixMyPrint and related 3D printing communities on Reddit. Each post was processed to extract:
- The printer model, filament type, and slicer software
- The failure being described
- Every fix suggested in the comment thread
- Whether the original poster came back to confirm a fix worked
We classify each fix into one of three confirmation buckets:
- Bucket A: The original poster replied to say the fix worked. This is our strongest signal.
- Bucket B: The fix received the highest community votes but no direct OP confirmation.
- Bucket C: The fix was mentioned but neither confirmed nor top-voted.
Success rates are calculated from Bucket A and B only. Bucket C fixes are only shown as a last resort when no higher-confidence data exists for that issue type.
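As a rough illustration, the bucket rules above could be sketched like this. The `Fix` class, its field names, and the helper functions are hypothetical and are not FixMyPrint's actual schema; the sketch only shows the A/B/C assignment and the last-resort fallback to Bucket C.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fix:
    name: str
    op_confirmed: bool  # original poster replied that the fix worked
    top_voted: bool     # highest-voted suggestion in the thread


def bucket(fix: Fix) -> str:
    """Assign a confirmation bucket; OP confirmation outranks votes."""
    if fix.op_confirmed:
        return "A"
    if fix.top_voted:
        return "B"
    return "C"


def fixes_to_show(fixes: List[Fix]) -> List[Fix]:
    """Return Bucket A/B fixes; fall back to Bucket C only if none exist."""
    higher = [f for f in fixes if bucket(f) in ("A", "B")]
    if higher:
        return higher
    return [f for f in fixes if bucket(f) == "C"]
```

With one fix of each kind, only the Bucket A and B fixes are returned; a Bucket C fix appears only when it is the sole candidate.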
How fixes are ranked
THE RANKING ALGORITHM
When you describe a print failure, FixMyPrint:
1. Maps your selected failure into one of 16 canonical issue types
2. Queries our fix database for your specific issue
3. Applies a three-tier lookup:
   - First: fixes confirmed for your exact printer model
   - Second: fixes confirmed for your filament type across all printers
   - Third: globally confirmed fixes for your issue type
Fixes are ranked by confirmed count and success rate within each tier. A fix with 10 OP confirmations ranks above one with 2, even if the lower-ranked fix has a higher raw success rate from a smaller sample.
Minimum thresholds apply: a fix needs at least 1 OP confirmation and a 20% success rate to appear in the primary results. Below that threshold, fixes are labeled as community suggestions rather than confirmed recommendations.
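A minimal sketch of the ranking described above, under stated assumptions: tiers are ordered exact printer, then filament type, then global; confirmed count is the primary sort key within a tier and success rate only breaks ties. The `RankedFix` class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_CONFIRMATIONS = 1    # from the text: at least 1 OP confirmation
MIN_SUCCESS_RATE = 0.20  # from the text: at least a 20% success rate


@dataclass
class RankedFix:
    name: str
    tier: int            # 1 = exact printer, 2 = filament type, 3 = global
    confirmed: int       # number of OP confirmations
    success_rate: float  # 0.0 to 1.0


def is_primary(fix: RankedFix) -> bool:
    """True if the fix clears both minimum thresholds."""
    return fix.confirmed >= MIN_CONFIRMATIONS and fix.success_rate >= MIN_SUCCESS_RATE


def sort_key(fix: RankedFix) -> Tuple[int, int, float]:
    # Lower tier first, then more confirmations, then higher success rate.
    return (fix.tier, -fix.confirmed, -fix.success_rate)


def rank(fixes: List[RankedFix]) -> Tuple[List[RankedFix], List[RankedFix]]:
    """Split into (primary results, community suggestions), each sorted."""
    primary = sorted((f for f in fixes if is_primary(f)), key=sort_key)
    suggestions = sorted((f for f in fixes if not is_primary(f)), key=sort_key)
    return primary, suggestions
```

Under this ordering, a tier-1 fix with 10 confirmations and a 50% success rate sorts ahead of a tier-1 fix with 2 confirmations and a 90% success rate, which matches the example in the text.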
How bad data is handled
HOW BAD DATA IS HANDLED
Single confirmations are not treated as ground truth. A fix confirmed once by one user in one session is a weak signal - environmental factors, coincidental changes, or an unrelated fix applied at the same time can all produce a false positive.
The bucket system handles this naturally. A fix reaches Bucket A only when the original poster explicitly confirmed it worked. Fixes accumulate confirmations across independent users and independent sessions - a fix confirmed 50 times across different printers, climates, and filament brands is a meaningfully different signal than one confirmed once. The ranking weights confirmation count heavily, so consensus emerges from volume rather than being declared from a single data point.
The hardest case is the hidden variable - a user who dries their filament, applies a slicer fix, and attributes the success to the slicer fix. The current mitigation: filament drying appears as its own ranked cause in every relevant diagnosis. If drying is consistently co-applied with a slicer fix and success rates without drying are lower, that pattern surfaces in the data over time. Bad data is an inherent cost of passive collection. The defense is volume and consensus, not perfection at the individual label level.
The deterministic settings engine
WHY SETTINGS ARE DETERMINISTIC
The Settings Generator is entirely rule-based. No AI, no machine learning, no token sampling.
We tested generative settings early in development and rejected the approach for three reasons:
1. Hallucination risk: language models generate plausible-sounding values that may be outside safe operating ranges for specific filaments or printers.
2. Slicer naming inconsistency: every slicer uses different field names for the same concept. Generative models have no reliable mapping mechanism.
3. Inconsistency across sessions: the same question produces different answers from session to session. For a tool users rely on to fix expensive print failures, inconsistency destroys trust.
The engine instead uses a deterministic resolution chain:
Every output value is auditable: you can trace it back through the chain to the rule that produced it. Safe parameter bounds are enforced at the final step before output, sourced from manufacturer specifications and filament datasheets.
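To make the auditability and final bounds step concrete, here is a small sketch. It does not reproduce the actual resolution chain; the rule name, the setting name, and the numeric bounds are hypothetical placeholders. It only shows how each rule can record itself in a trace and how a clamp can run last.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Setting:
    name: str
    value: float
    trace: List[str] = field(default_factory=list)  # one entry per applied rule


def apply_rule(setting: Setting, rule_name: str, new_value: float) -> Setting:
    """Set a new value and record which rule produced it."""
    setting.trace.append(f"{rule_name}: {setting.value} -> {new_value}")
    setting.value = new_value
    return setting


def enforce_bounds(setting: Setting, low: float, high: float) -> Setting:
    """Final step: clamp into [low, high] and record the clamp if it changed anything."""
    clamped = min(max(setting.value, low), high)
    if clamped != setting.value:
        apply_rule(setting, f"bounds[{low}, {high}]", clamped)
    return setting
```

Because every change goes through `apply_rule`, the trace list answers the question of which rule produced the final value, including whether the safety clamp had to intervene.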
Pre-flight safety checks
WHAT V2 CHECKS BEFORE GENERATION
Before settings are returned, FixMyPrint runs 8 hardware-aware checks against the selected printer, filament, nozzle, hotend, bed surface, and available storage context.
- Enclosure compatibility
- Nozzle abrasion risk
- Temperature ceiling vs printer hotend
- PTFE safety
- Volumetric flow limits
- Moisture risk from filament, storage, dried date, and environment
- Bed surface compatibility
- Extruder type compatibility
Hard stops block genuinely risky combinations and explain what to change. Soft stops and advisories preserve an escape path for users who know their setup, while still making the risk explicit.
The intent is not to fight expert users. It is to prevent common, expensive mistakes before filament, nozzles, or hotends pay the price.
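The hard-stop and soft-stop distinction could look something like the sketch below. Only two of the eight checks are shown, and which severity each check carries is an assumption here, as are all names and the temperatures used in the usage note.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Severity(Enum):
    HARD_STOP = "hard_stop"  # blocks generation until the setup changes
    SOFT_STOP = "soft_stop"  # warns, but the user may proceed


@dataclass
class Finding:
    check: str
    severity: Severity
    message: str


def check_temperature_ceiling(required_c: float, hotend_max_c: float) -> Optional[Finding]:
    # Assumption: a filament that needs more heat than the hotend allows is a hard stop.
    if required_c > hotend_max_c:
        return Finding("temperature_ceiling", Severity.HARD_STOP,
                       f"Filament needs {required_c} C but the hotend is limited to {hotend_max_c} C.")
    return None


def check_enclosure(enclosure_recommended: bool, has_enclosure: bool) -> Optional[Finding]:
    # Assumption: a missing enclosure is a soft stop, not a blocker.
    if enclosure_recommended and not has_enclosure:
        return Finding("enclosure", Severity.SOFT_STOP,
                       "This filament usually prints better in an enclosure.")
    return None


def run_preflight(results: List[Optional[Finding]]) -> Tuple[bool, List[Finding]]:
    """Return (blocked, findings); blocked is True if any hard stop fired."""
    findings = [r for r in results if r is not None]
    blocked = any(f.severity is Severity.HARD_STOP for f in findings)
    return blocked, findings
```

For example, `run_preflight([check_temperature_ceiling(260, 240), check_enclosure(True, False)])` reports a blocked run with two findings, while the same call with a required temperature of 210 leaves only the enclosure advisory and does not block.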
The filament database
WHERE FILAMENT DATA COMES FROM
FixMyPrint's filament database aggregates data from 10 sources:
- Ultimaker/Cura fdm_materials (CC0 licensed)
- PrusaSlicer filament profiles (Apache 2.0)
- OrcaSlicer filament profiles
- Bambu Studio filament profiles
- SpoolmanDB community filament database
- Open Filament Database (OpenFilamentCollective)
- Marlin firmware printer configurations (hardware limits)
- Klipper printer configurations (hardware limits)
- 3DFilamentProfiles.com community data
- Hand-curated manufacturer specifications
Where sources disagree, we apply conflict resolution rules: community-verified data takes precedence over manufacturer claims when the discrepancy exceeds 15°C. Conservative ranges are applied when no community data exists to resolve a conflict.
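The 15°C rule can be sketched for the simple case where both a manufacturer value and a community-verified value exist. What happens at or below the threshold is not stated above, so keeping the manufacturer value there is an assumption, and the conservative-range fallback for missing community data is not modeled. The function name is hypothetical.

```python
from typing import Tuple

DISCREPANCY_THRESHOLD_C = 15.0  # from the text: community wins above this gap


def resolve_temperature(manufacturer_c: float, community_c: float) -> Tuple[float, str]:
    """Pick a temperature and report which source supplied it."""
    if abs(community_c - manufacturer_c) > DISCREPANCY_THRESHOLD_C:
        return community_c, "community"
    # Assumption: within the threshold, the manufacturer claim stands.
    return manufacturer_c, "manufacturer"
```

So a manufacturer claim of 210 against a community value of 230 resolves to the community value, while a gap of exactly 15 does not exceed the threshold and leaves the manufacturer value in place.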
1,317 brand+material combinations are currently in the database. Coverage is highest for PLA (194 brands), PETG (164 brands), and ABS (131 brands).
AI normalization scope
AI CLASSIFIES INPUTS. THE ENGINE RETURNS FIXES.
FixMyPrint does not use AI to generate settings or fix advice. AI is used as an input normalization layer: it maps user-described print problems into one of 16 canonical issue types and, for Photo Diagnosis, detects the visible failure type from an uploaded image. The fix-intelligence layer then ranks community-confirmed fixes.
Fix My Print: wording to canonical issue
GPT-4o reads the user's problem description and returns structured JSON: three ranked candidate causes, each with a required snake_case issue id from the allowed list, a label, root_cause, confidence, severity, and likely_causes. Server-side validation rejects unknown issue IDs and retries or salvages responses where root_cause simply repeats the issue label.
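A sketch of what that server-side validation might look like. The `candidates` wrapper key, the `issue_id` key name, and the three-item allowed list are assumptions made for illustration; the real allowed list has 16 entries, and the retry and salvage paths are not shown.

```python
import json
from typing import List

# Hypothetical subset of the 16 canonical issue ids.
ALLOWED_ISSUE_IDS = {"stringing", "warping", "elephant_foot"}
REQUIRED_FIELDS = ("issue_id", "label", "root_cause", "confidence", "severity", "likely_causes")


def normalize(text: str) -> str:
    """Lowercase, treat underscores as spaces, collapse whitespace."""
    return " ".join(text.lower().replace("_", " ").split())


def validate_response(raw: str) -> List[str]:
    """Return a list of problems; an empty list means the response is usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    candidates = data.get("candidates") if isinstance(data, dict) else None
    if not isinstance(candidates, list) or len(candidates) != 3:
        return ["expected exactly three candidates"]
    errors: List[str] = []
    for i, cand in enumerate(candidates):
        if not isinstance(cand, dict):
            errors.append(f"candidate {i}: not an object")
            continue
        missing = [k for k in REQUIRED_FIELDS if k not in cand]
        if missing:
            errors.append(f"candidate {i}: missing {missing}")
            continue
        if cand["issue_id"] not in ALLOWED_ISSUE_IDS:
            errors.append(f"candidate {i}: unknown issue id {cand['issue_id']!r}")
        if normalize(str(cand["root_cause"])) == normalize(str(cand["label"])):
            errors.append(f"candidate {i}: root_cause repeats label")
    return errors
```

A caller could retry the model when the returned list is non-empty, or salvage the candidates that passed. Either way, nothing outside the allowed list reaches the fix-intelligence layer.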
Photo Diagnosis: visual failure detection (Pro)
GPT-4o Vision analyzes your photo against specific visual signatures for each failure type. The visual identification feeds into the same fix-intelligence engine. AI classifies the input. The deterministic engine returns the fixes.
AI normalizes the problem statement or photo into a canonical issue. Fix recommendations and settings come from deterministic logic and community-confirmed data.
The vision data layer
THE VISION DATA LAYER - WHERE IT'S GOING
Photo Diagnosis currently uses GPT-4o Vision as a generalist interpreter - it identifies failure modes against its general training, not against a curated corpus of labeled FixMyPrint failures. This works well for visually distinct failures (stringing, warping, elephant foot) and struggles with subtle ones (5% under-extrusion on textured PEI, where the failure is dimensional rather than visual).
The longer-term architecture: every photo diagnosis that receives a confirmed fix is a labeled training example - printer model, filament, failure image, confirmed outcome. That corpus grows passively with every Pro user who uploads a photo and confirms a result. When the volume reaches statistical significance per failure mode, the vision layer stops relying on a general-purpose model and starts being trained on FixMyPrint-specific failure signatures.
A single user confirmation is not a golden label. The threshold for an image-fix pair to be considered high-confidence training data requires multiple independent confirmations of the same failure mode, same fix, same printer family, across different users and sessions. The same consensus principle that governs the text fix data applies to the image data.
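One way the consensus threshold could be expressed, as a sketch only: group confirmations by failure mode, fix, and printer family, then count distinct users. The `Confirmation` record, the function name, and the minimum of 3 independent users are hypothetical; the text above does not state a number.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Confirmation(NamedTuple):
    user_id: str
    session_id: str
    failure_mode: str
    fix: str
    printer_family: str


MIN_INDEPENDENT_USERS = 3  # hypothetical threshold

Key = Tuple[str, str, str]


def high_confidence_keys(confirmations: Iterable[Confirmation]) -> Set[Key]:
    """Return (failure_mode, fix, printer_family) groups confirmed by enough distinct users."""
    users_by_key: Dict[Key, Set[str]] = defaultdict(set)
    for c in confirmations:
        users_by_key[(c.failure_mode, c.fix, c.printer_family)].add(c.user_id)
    return {key for key, users in users_by_key.items() if len(users) >= MIN_INDEPENDENT_USERS}
```

Counting distinct users rather than raw confirmations means one user confirming the same fix across several sessions still counts once.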
What we don't claim
HONEST LIMITATIONS
- FixMyPrint works best for the 16 most common FDM failure types. Obscure or hardware-specific failures may not have enough community data to produce confident recommendations.
- Fix success rates reflect community data, not controlled experiments. Real-world results depend on factors we can't account for: room temperature, filament brand variation, hardware wear, and user skill.
- The dataset is weighted toward popular printers and filaments. A Bambu Lab P1S running Bambu PLA has far more community data than a Vivedino Raptor running specialty nylon.
- Community data has recency bias. Fixes that worked 3 years ago may be less relevant for newer printer models or updated slicer versions.
Open to scrutiny
WE WELCOME SKEPTICISM
If you're a researcher, journalist, or developer who wants to dig deeper, we're happy to share more detail about our methodology, data collection process, or engine design.