In an application of machine learning to the study of state violence, recent research examines the factors driving extrajudicial killings in Colombia and Mexico. Using predictive models, including SuperLearner ensemble techniques and XGBoost algorithms, the researchers seek to untangle the geographic, socioeconomic, military, and political variables that shape patterns of human rights violations in these two Latin American nations. Because the analytical framework is predictive rather than causal, the feature importance scores derived from these models indicate which factors most strongly influence the models’ predictions and, by extension, which contextual risk factors are associated with deadly state violence; they do not establish that those factors cause it.
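To make the ensemble idea concrete, the sketch below illustrates one ingredient of a Super Learner-style approach: choosing how much weight each base learner receives by minimizing error on out-of-fold predictions. It is a simplified illustration, not the study's implementation. The function names, the two-learner setup, the grid search (standing in for a proper constrained optimization), and all numbers are assumptions made for this example.

```python
# Illustrative sketch of the weighting step in a Super Learner-style ensemble.
# All values below are made-up placeholders, not data from the study.

def mean_squared_error(y_true, y_pred):
    """Average squared difference between outcomes and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def best_convex_weight(y_true, preds_a, preds_b, steps=100):
    """Grid-search the weight w in [0, 1] that minimizes the squared error
    of the blend w * preds_a + (1 - w) * preds_b."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]
        loss = mean_squared_error(y_true, blended)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss

# Hypothetical out-of-fold predictions from two base learners.
y_true = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]
learner_a = [0.2, 0.7, 0.1, 0.9, 0.6, 0.4]
learner_b = [0.4, 0.9, 0.3, 0.6, 0.8, 0.1]

weight_a, blended_loss = best_convex_weight(y_true, learner_a, learner_b)
print(f"weight on learner A: {weight_a:.2f}, blended error: {blended_loss:.4f}")
```

Because the grid includes the endpoints 0 and 1, the blended error can never exceed that of the better individual learner on the same out-of-fold data, which is the basic motivation for combining learners this way.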
The analysis indicates that in Colombia, geographic and environmental variables are the strongest predictors of extrajudicial killings. This category includes indicators such as whether an area is rural, the size of the municipality, altitude, rainfall, soil quality, proximity to major urban centers like Bogotá, and location within specific regional contexts such as the Andean or Amazon regions. Together, these variables capture the logistical and accessibility challenges faced by state actors, which in turn correlate with a higher incidence of violent state behavior. The implication is that remote or hard-to-reach regions may experience diminished state oversight, producing environments where abuses by military or paramilitary forces are more likely to occur without immediate accountability.
Alongside geographic factors, socioeconomic and demographic variables also carry notable weight in Colombia’s prediction models. Markers of social vulnerability such as unemployment rates, unmet basic needs, disparities in educational attainment, and the historical presence of minorities contribute substantially to the estimated risk of extrajudicial violence. These results point to the broader structural divides and social marginalization that leave certain communities more exposed to violence from state actors. They suggest that social exclusion and entrenched inequality create fertile ground for coercive state practices, since marginalized populations are more susceptible to abuse where formal protections and institutional supports are weak or absent.
Contrary to common assumptions, indicators linked directly to military and security operations (such as the intensity of military presence, coca cultivation zones, conflict intensity, and the rank of military commanders) carry less predictive weight than expected in the Colombian context. This finding challenges the simple narrative that the mere presence of armed conflict or security forces directly drives patterns of extrajudicial killings. Instead, it points to a more nuanced dynamic in which military actions, though important, are embedded in a broader environmental and social context that modulates their likelihood and intensity. One possibility is that military forces operate differently depending on geographic isolation and socio-demographic vulnerability, shaping the distribution of violence in complex and non-linear ways.
Political and institutional factors, including judicial capacity and the extent of social mobilization or church presence, have the least influence on the models’ predictions in Colombia. This may indicate that, while institutional quality and political alignment are crucial for long-term governance and legal accountability, they have relatively limited power to predict immediate or short-term variation in state-led violence. It suggests a disconnect between institutional frameworks and the realities of coercive security tactics on the ground, potentially reflecting limits on institutional reach or effectiveness in certain localized contexts.
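A category-level ranking of this kind can be produced by summing per-feature importance scores within each group of variables and normalizing the totals. The sketch below shows that roll-up under stated assumptions: the feature names, the mapping to categories, and the scores are hypothetical placeholders chosen only to mirror the ordering described above, and the study's actual scores are not reproduced here.

```python
from collections import defaultdict

# Hypothetical feature-to-category mapping; names are illustrative only.
FEATURE_CATEGORY = {
    "rural": "geographic",
    "altitude": "geographic",
    "unemployment": "socioeconomic",
    "unmet_basic_needs": "socioeconomic",
    "military_presence": "military",
    "judicial_capacity": "political",
}

def category_shares(importances, feature_category):
    """Sum feature importances within each category and normalize to shares."""
    totals = defaultdict(float)
    for feature, score in importances.items():
        totals[feature_category[feature]] += score
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}

def rank_categories(shares):
    """Return category names ordered from largest to smallest share."""
    return sorted(shares, key=shares.get, reverse=True)

# Placeholder importance scores (NOT the study's results).
importances = {
    "rural": 0.25,
    "altitude": 0.20,
    "unemployment": 0.20,
    "unmet_basic_needs": 0.15,
    "military_presence": 0.125,
    "judicial_capacity": 0.075,
}

shares = category_shares(importances, FEATURE_CATEGORY)
ranking = rank_categories(shares)
print(ranking)
```

Summing within groups is only one possible design choice. Categories that contain more variables can accumulate larger totals, so an average per feature may be a fairer comparison in some settings.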
Turning to Mexico, the story shifts with socioeconomic and demographic factors emerging as the dominant predictors of extrajudicial killings across both the XGBoost and ensemble modeling approaches. These encompass a range of indicators such as population size, marginality indices, age demographics, health, education, and income disparities. Together, they map broader social stress and exclusion dimensions, contextualizing state violence within enduring patterns of inequality and social fragility. The data suggest that in Mexico’s case, the interplay of socioeconomic deprivation and demographic pressures forms the core substrate upon which violent state actions are more likely to erupt.
Military and security variables in Mexico also register prominently, especially in the XGBoost framework, where aspects like the existence and duration of military operations, as well as local homicide rates, closely follow socioeconomic factors in their predictive import. This underscores the acute relevance of active security interventions and ambient violence levels in shaping the immediate risk environment for lethal state violence. However, the ensemble model gives greater emphasis to geographic and environmental factors, highlighting the spatial variability of these dynamics and reinforcing the idea that the physical and ecological context shapes patterns of governance and violence.
The divergence in how the different models weight military, geographic, and socioeconomic factors underscores the multifaceted nature of state violence in Mexico. It implies that while the presence of security forces and local violence is integral to understanding extrajudicial killings, the broader geographic context—encompassing terrain, rural-urban divides, and resource accessibility—is equally crucial. Such findings illuminate the ways in which logistical realities influence state capacity and behavior, affecting how and where state violence manifests.
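One simple way to describe such a divergence is to compare where each category of variables ranks under each model. The sketch below does this for two sets of category-level importance shares. The shares are hypothetical placeholders loosely patterned on the description above (socioeconomic factors lead in both models, with military and geographic factors trading places), and the function names are assumptions made for this example.

```python
def ranks(shares):
    """Map each category to its rank (1 = most important) by share."""
    ordered = sorted(shares, key=shares.get, reverse=True)
    return {category: position for position, category in enumerate(ordered, start=1)}

def rank_shifts(shares_a, shares_b):
    """Rank under model B minus rank under model A, per category.
    Negative values mean the category ranks higher under model B."""
    ranks_a, ranks_b = ranks(shares_a), ranks(shares_b)
    return {category: ranks_b[category] - ranks_a[category] for category in ranks_a}

# Placeholder category shares (NOT the study's results).
xgboost_shares = {
    "socioeconomic": 0.40, "military": 0.30, "geographic": 0.20, "political": 0.10,
}
ensemble_shares = {
    "socioeconomic": 0.40, "geographic": 0.30, "military": 0.20, "political": 0.10,
}

shifts = rank_shifts(xgboost_shares, ensemble_shares)
print(shifts)
```

Categories with a shift of zero are ranked identically by both models. Nonzero shifts flag where the conclusions depend on the choice of model.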
Unlike in Colombia, temporal trends in Mexico appear to play a subdued role in predicting extrajudicial killings. This relative stability suggests that structural and spatial factors exert a persistent influence over time, implying a consistent risk landscape that resists short-term fluctuations. It also hints at entrenched structural challenges that impair the state’s ability to reduce or prevent human rights violations despite potential policy or military interventions over the years analyzed.
Taken together, these analyses suggest that predicting extrajudicial killings requires an integrative approach that moves beyond simple crime or conflict metrics. Instead, characteristics rooted in geography, social structure, and institutional context collectively shape the risk environment. The surprisingly limited role of military presence in Colombia contrasts with its significant weight in Mexico, indicating regional variation in how state violence is operationalized and understood.
Moreover, the importance of geographic and environmental factors spotlights the critical role of accessibility, infrastructure, and regional isolation in mediating state action. Areas underserved by state presence—characterized by challenging terrain or remoteness—may create conditions conducive to unmonitored coercion and violence. These findings align with broader theory suggesting that logistical and environmental hurdles limit the state’s ability to enforce accountability and control in peripheral spaces.
The methodological innovation lies in the use of advanced machine learning algorithms like XGBoost and ensemble modeling to synthesize diverse data streams across time, space, and social indicators. By quantifying feature importance, these techniques provide a nuanced ranking of the variables most influential in predicting patterns of lethal violence, offering a powerful complement to traditional qualitative and econometric analyses.
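The article does not specify which importance measure the study used, and tree-based libraries such as XGBoost commonly report their own gain-based scores. As a generic illustration of what "quantifying feature importance" can mean, the sketch below implements permutation importance from scratch: shuffle one feature at a time and record how much predictive accuracy drops. The toy model, the synthetic data, and every name in the code are assumptions made for this example.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(rows, labels)) / len(labels)

def permutation_importance(model, rows, labels, n_features, seed=0, repeats=20):
    """Average drop in accuracy when each feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for j in range(n_features):
        total_drop = 0.0
        for _ in range(repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild each row with feature j replaced by its shuffled value.
            shuffled = [row[:j] + (value,) + row[j + 1:]
                        for row, value in zip(rows, column)]
            total_drop += baseline - accuracy(model, shuffled, labels)
        drops.append(total_drop / repeats)
    return drops

def toy_model(row):
    """Stand-in model: predicts 1 when feature 0 exceeds 0.5; ignores feature 1."""
    return 1 if row[0] > 0.5 else 0

# Synthetic data: two random features, labels generated by the toy model itself.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [toy_model(row) for row in rows]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Shuffling the feature the toy model ignores leaves its accuracy unchanged, so that feature receives an importance of zero, while shuffling the feature it relies on produces a clear drop. As with the scores discussed in this article, such values describe what a model uses to predict, not what causes the outcome.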
These insights hold promise for future interdisciplinary research as well as practical policy design. Understanding that geography and socioeconomic deprivation are foundational predictors invites strategies aimed at enhancing infrastructural development, social inclusion, and geographic integration to mitigate the underlying risk factors. Furthermore, the limited predictive role of judicial and political institutional quality signals the need for strengthening these sectors to ensure mechanisms of accountability can translate into tangible violence reduction.
As researchers continue to explore these complex dynamics, there is a call for causal investigations to unpack mechanisms by which geographic isolation, social vulnerability, and military presence interact to drive state violence. Such endeavors could inform targeted interventions that interrupt these pathways, reducing extrajudicial killings and safeguarding human rights in fragile contexts.
In sum, this study represents a notable advance in empirical understandings of state violence, employing state-of-the-art machine learning tools to uncover patterns and predictors of extrajudicial killings in Colombia and Mexico. The findings articulate a layered reality in which environmental conditions, social inequalities, and security forces coalesce to shape the deadly contours of state coercion. This research not only enriches academic debates but also illuminates critical avenues for policymakers and human rights defenders seeking to stem the tide of violence in the region.
As machine learning continues to gain traction as a means of analyzing large, complex datasets in the social sciences, such integrative approaches could change how we understand and ultimately reduce systemic violence. By bridging computational techniques with human rights concerns, this research points toward an approach in which data-driven insights guide more effective and just governance.
Subject of Research: Predicting state violence through machine learning models applied to extrajudicial killings in Colombia and Mexico.
Article Title: Predicting police and military violence: evidence from Colombia and Mexico using machine learning models.
Article References:
Gelvez, J.D. Predicting police and military violence: evidence from Colombia and Mexico using machine learning models. Humanit Soc Sci Commun 12, 765 (2025). https://doi.org/10.1057/s41599-025-04967-w