
fix: Avoid deep copying in ModelContainer to resolve pickling errors #559


Open

wants to merge 11 commits into main

Conversation

laksi1999

Description

This pull request introduces a new model_objects parameter in the ModelContainer class, allowing users to pass pre-created model objects directly. The primary change eliminates the need to deep copy model_object, which previously caused "cannot pickle" errors for non-serializable objects, such as those from Driverless AI. Instead, the updated implementation checks for a model_objects list and, when one is provided, assigns a model instance from it, preserving compatibility with existing tools.
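The selection logic described above can be sketched as follows. This is a minimal sketch: the function name `get_model_instance` is an assumption for illustration, not the library's actual code; only the attribute names `model_object` and `model_objects` come from the PR description.

```python
import copy

def get_model_instance(model_param):
    # Sketch of the described logic (names assumed from this PR):
    # prefer a pre-created instance from model_objects; otherwise
    # fall back to deep copying the single model_object.
    pre_created = getattr(model_param, "model_objects", None)
    if pre_created:
        return pre_created.pop(0)
    return copy.deepcopy(model_param.model_object)
```

With pre-created instances available, no deep copy (and hence no pickling) is attempted; without them, behaviour falls back to the original deep-copy path.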

Motivation and Context

The original implementation relied on deep copying experiment objects, leading to serialization issues when handling non-picklable objects. This limitation prevented users from effectively integrating their models, particularly when using community-standard tools. By introducing the model_objects parameter, this update resolves the serialization issue while maintaining backward compatibility with the existing single model_object approach.

Type of Change

fix: A bug fix

How to Test

  1. Create a ModelContainer instance, passing the existing model_object parameter along with the new model_objects parameter set to a pre-created list of model objects.
  2. Create a use case object that utilizes the ModelContainer instance.
  3. Run fairness and feature importance analysis for the created use case object.
  4. Validate that no "cannot pickle" errors occur when interacting with Driverless AI objects.
  5. Verify that the original functionality remains intact when only using the single model_object without the model_objects parameter.

Checklist

Please check all the boxes that apply to this pull request using "x":

  • I have tested the changes locally and verified that they work as expected.
  • I have added or updated the necessary documentation (README, API docs, etc.).
  • I have added appropriate unit tests or functional tests for the changes made.
  • I have followed the project's coding conventions and style guidelines.
  • I have rebased my branch onto the latest commit of the main branch.
  • I have squashed or reorganized my commits into logical units.
  • I have added any necessary dependencies or packages to the project's build configuration.
  • I have performed a self-review of my own code.
  • I have read, understood and agree to the Developer Certificate of Origin below, which this project utilises.

Screenshots (if applicable)

N/A

Additional Notes

This change ensures that non-picklable objects can be utilized without requiring workarounds, improving compatibility with community tools and frameworks.

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
   have the right to submit it under the open source license
   indicated in the file; or

(b) The contribution is based upon previous work that, to the best
   of my knowledge, is covered under an appropriate open source
   license and I have the right under that license to submit that
   work with modifications, whether created in whole or in part
   by me, under the same open source license (unless I am
   permitted to submit under a different license), as indicated
   in the file; or

(c) The contribution was provided directly to me by some other
   person who certified (a), (b) or (c) and I have not modified
   it.

(d) I understand and agree that this project and the contribution
   are public and that a record of the contribution (including all
   personal information I submit with it, including my sign-off) is
   maintained indefinitely and may be redistributed consistent with
   this project or the open source license(s) involved.

@imda-benedictlee
Contributor

imda-benedictlee commented May 27, 2025

Hi @laksi1999 Thanks for the PR contribution. I will take a look at the PR together with the team.

@imda-benedictlee
Contributor

imda-benedictlee commented May 27, 2025

Hi @laksi1999, can I check if you have sample non-serialisable objects to provide for testing this PR? Also, would like to check if this is run via the Notebook and CLI Standalone Plugin?

@imda-benedictlee
Contributor

Unit Test for Veritas Passed
unit-test.txt

Veritas CLI Ran Successfully
cli-test.txt

Veritas CLI Docker Ran Successfully
cli-test-docker.txt

Veritas Example Notebooks Ran Successfully
CM_output.zip
PUW_output.zip
CS_output.zip
BaseRegression_output.zip
BaseClassification_output.zip

@laksi1999
Author

@imda-benedictlee Thanks for reviewing the PR.

I’ve attached a notebook that reproduces the error encountered when calling feature_importance() from the aiverify_veritastool:
TypeError: cannot pickle 'daimojo.cppmojo.model' object.
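For readers without a Driverless AI environment, this class of error can be reproduced with any object holding a non-serializable handle. Below, a threading.Lock is a hypothetical stand-in for the native MOJO handle; it is not the actual daimojo object.

```python
import copy
import threading

class UnpicklableModel:
    """Stand-in for a model that holds a non-serializable native
    handle; a lock, like a native runtime object, cannot be pickled."""
    def __init__(self):
        self._handle = threading.Lock()

try:
    # deepcopy falls back to pickle-style reduction for unknown
    # objects, which fails on the lock held inside the model.
    copy.deepcopy(UnpicklableModel())
except TypeError as exc:
    print(exc)  # e.g. "cannot pickle '_thread.lock' object"
```

Any code path that deep copies such a model, as the current feature_importance() implementation does, hits the same TypeError.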

In this notebook, I used a MOJO model. I’ve also included the corresponding model file and the datasets used. Please see the attached Veritas_artifacts.zip

However, since you won’t be able to test the MOJO model in your environment due to authentication restrictions, I’ve also attached a scikit-learn model for testing purposes. Along with that, I’ve included two notebooks:

  • The first shows how the feature_importance() function fails with the current implementation due to a deep copy issue.
  • The second shows how this PR resolves that issue using a modified version of the aiverify_veritastool.

Please see the attached Veritas_artifacts_scikit_learn.zip for the artifacts.

@imda-benedictlee
Contributor

Hi @laksi1999, thanks for the help on this. I will take a look and review the PR.

@timlrx
Collaborator


The leave one out calculation in the feature importance method uses a ThreadPoolExecutor to run multiple threads concurrently, each analysing a different protected variable. Without deep copying there might be race conditions.

I get that allowing a model_objects field would bypass the serialization but it introduces other issues:

  • How do we know the number of model_objects to pass?
  • From the second thread onwards, each thread gets a different model_object or falls back to deepcopy
  • There might also be a race condition with multiple threads calling pop on the same list.

Open to suggestions on how to avoid serialization, but we need to create multiple identical copies for parallel processing, not manage a pool of different model objects.
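The exhaustion concern above can be illustrated with a minimal sketch (all names here are illustrative, not from the PR): when more worker threads than pre-created models pop from one shared list, later workers come up empty.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared pool: fewer pre-created models than worker threads.
shared_models = ["model_a", "model_b"]

def take_model(_):
    # Every worker pops from the same list; once it is exhausted,
    # the remaining workers have no model to use.
    try:
        return shared_models.pop(0)
    except IndexError:
        return None

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(take_model, range(4)))
# Exactly two workers obtain a model; the other two get None.
```

Since the thread count is an internal detail of the feature importance method, the caller cannot reliably size the list in advance.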

Comment on lines +4203 to +4205
model_obj = getattr(model_param, "model_object", None)
if hasattr(model_param, 'model_objects') and model_param.model_objects:
model_object = model_param.model_objects.pop(0)
Collaborator


What is passed in model_objects? How do we ensure that there is a sufficient number of model_objects for the threads to access?

@laksi1999
Author

@timlrx Based on your feedback, I propose using a lightweight ModelObjectsWrapper, which provides a pop() method that loads a new model instance from disk on demand.

from abc import ABC, abstractmethod
from typing import Any

class ModelObjectsWrapper(ABC):
    @abstractmethod
    def pop(self, index) -> Any:
        """Return a new model object instance."""
        pass

For example, we can use it as follows:

import joblib
from typing import Any

class MyModelObjectsWrapper(ModelObjectsWrapper):
    def __init__(self, path: str):
        self._path = path

    def pop(self, index) -> Any:
        # Load a fresh, independent model instance from disk on each call
        return joblib.load(self._path)

model_objects = MyModelObjectsWrapper("/path/to/model")

Why This Approach Solves the Problem:

  • No need to predefine the number of models: Each thread gets a fresh model by calling pop(), so thread count doesn’t matter.
  • No shared list = no race condition: There's no shared mutable state; each call is independent.
  • Avoids deepcopy fallback: Models are loaded cleanly, avoiding serialization issues.
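The points above can be demonstrated with a self-contained sketch of the same pattern. BytesModelObjectsWrapper is hypothetical: serialized bytes stand in for a model file, where a real implementation would reload from a path (e.g. with joblib.load as shown earlier).

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

class BytesModelObjectsWrapper:
    """Illustrative wrapper following the proposed interface: pop()
    rebuilds a fresh model on every call instead of draining a shared
    list, so thread count does not matter and no state is shared."""
    def __init__(self, blob: bytes):
        self._blob = blob

    def pop(self, index=None):
        # A new, independent instance per call: no shared mutable state.
        return pickle.loads(self._blob)

template_model = {"coef": [1.0, 2.0]}  # stand-in for a trained model
wrapper = BytesModelObjectsWrapper(pickle.dumps(template_model))

# Each of the three workers receives its own identical copy.
with ThreadPoolExecutor(max_workers=3) as executor:
    models = list(executor.map(lambda _: wrapper.pop(), range(3)))
```

Every call yields an equal but distinct object, which is exactly the "multiple identical copies for parallel processing" requirement, without deep copying the original.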

I’ve attached the notebook with the current solution. veritas_with_iris_dataset_with_success.zip

The pop method takes an index because the earlier implementation relied on one. If the current proposal makes sense, I can simplify the code by using pop() without arguments and removing the index logic.

Please let me know your thoughts on this.
