Excessive Debug Logs in recursive_serialize_or_str Function #4110

Closed
dhlidongming opened this issue Oct 11, 2024 · 3 comments · Fixed by #4132
Labels
bug Something isn't working

Comments

@dhlidongming
Contributor

Bug Description

The recursive_serialize_or_str function in schema/schema.py emits an excessive number of debug-level logs when it processes dictionaries with many key-value pairs, and this logging noticeably increases execution time for some components.
The function recurses into dictionaries, and for each value it calls issubclass(obj, BaseModel), which raises an exception when the value is not a class (for example, a plain string). The exception is caught and the value is then treated as a string, but every caught exception writes a debug log, so a dictionary with many keys produces one log entry per value.
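To illustrate the failure mode described above, here is a minimal sketch. `Model` is a hypothetical stand-in for pydantic's `BaseModel`, used only so the snippet runs without third-party packages; `issubclass` behaves the same way for any class.

```python
class Model:
    """Hypothetical stand-in for pydantic's BaseModel."""


class Child(Model):
    pass


def is_model_class(obj):
    """Return the issubclass result, or the error message if obj is rejected."""
    try:
        return issubclass(obj, Model)
    except TypeError as exc:
        # issubclass requires its first argument to be a class.
        return str(exc)


result_for_class = is_model_class(Child)   # a class: the check succeeds
result_for_str = is_model_class("hello")   # a str instance: TypeError is raised
```

Passing an instance such as a string takes the exception path every time, which is why each string value in a dictionary ends up in the `except` branch.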

Reproduction

  1. Create a dictionary with a large number of key-value pairs, or use a component (such as API Request) that outputs a large dictionary.
  2. Call the recursive_serialize_or_str method with this dictionary.
  3. Monitor the logs generated during execution.
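The reproduction steps can be sketched roughly as follows. This is an assumption-laden reconstruction, not the actual Langflow code: `naive_serialize`, `Model`, `CountingHandler`, and the logger name are all hypothetical, and the function only mimics the behavior described in this issue.

```python
import logging


class Model:
    """Hypothetical stand-in for pydantic's BaseModel."""


class CountingHandler(logging.Handler):
    """Counts the records it receives instead of printing them."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.count = 0

    def emit(self, record):
        self.count += 1


logger = logging.getLogger("serialize_repro")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = CountingHandler()
logger.addHandler(handler)


def naive_serialize(obj):
    """Assumed shape of the problematic logic, not the real implementation."""
    try:
        if isinstance(obj, dict):
            return {k: naive_serialize(v) for k, v in obj.items()}
        if issubclass(obj, Model):  # raises TypeError for non-class values
            return repr(obj)
        return str(obj)
    except Exception as exc:
        logger.debug("Cannot serialize object %s: %s", obj, exc)
        return str(obj)


# A large dictionary of plain strings, as in step 1.
big = {f"key_{i}": f"value_{i}" for i in range(1000)}
result = naive_serialize(big)
debug_records = handler.count  # one debug record per value
```

Under these assumptions, the number of debug records grows linearly with the number of values in the dictionary.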

Expected behavior

Handle the str type correctly, or ensure that each dictionary generates only one log instead of one for each value.
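One possible way to get the first behavior is sketched below. It is not the fix that was merged for this issue; `serialize_or_str` and `Model` are hypothetical names, and `Model` stands in for pydantic's `BaseModel` so the snippet is self-contained. The idea is to return strings early and to call `issubclass` only when the object really is a class, so ordinary values never reach the `except` branch.

```python
import logging

logger = logging.getLogger("serialize_sketch")


class Model:
    """Hypothetical stand-in for pydantic's BaseModel."""

    def __init__(self, **fields):
        self.fields = fields

    def model_dump(self):
        return dict(self.fields)


def serialize_or_str(obj):
    """Serialize recursively; plain values never hit the except branch."""
    if isinstance(obj, str):
        return obj  # strings need no further handling
    try:
        if isinstance(obj, dict):
            return {k: serialize_or_str(v) for k, v in obj.items()}
        if isinstance(obj, (list, tuple)):
            return [serialize_or_str(v) for v in obj]
        if isinstance(obj, Model):
            return {k: serialize_or_str(v) for k, v in obj.model_dump().items()}
        # Guard: only call issubclass when obj is actually a class.
        if isinstance(obj, type) and issubclass(obj, Model):
            return obj.__name__
        return str(obj)
    except Exception as exc:
        # Reached only on genuinely unexpected failures.
        logger.debug("Cannot serialize object %r: %s", obj, exc)
        return str(obj)


example = serialize_or_str({"a": "x", "b": [1, Model(c="y")], "d": Model})
```

With the guard in place, a debug log is written only for genuinely unexpected failures rather than once per string value.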

Who can help?

@italojohnny @ogabrielluiz

Operating System

Ubuntu Linux 22.04

Langflow Version

1.0.19

Python Version

3.12

Screenshot

image

Flow File

No response

@dhlidongming dhlidongming added the bug Something isn't working label Oct 11, 2024

dosubot bot commented Oct 11, 2024

Hey @dhlidongming! I'm here to help you with any bugs, questions, or contributions you have for Langflow. Let's work together to solve this issue.

To address the excessive debug logs in the recursive_serialize_or_str function and resolve the NameError for 'BaseModel', you can make the following changes:

  1. Optimize Logging:

    • Add conditional logging to capture only specific cases or errors, reducing the number of logs generated.
    import logging
    from collections.abc import AsyncIterator, Generator, Iterator
    
    from pydantic import BaseModel
    
    logger = logging.getLogger(__name__)
    
    def recursive_serialize_or_str(obj):
        try:
            if isinstance(obj, dict):
                return {k: recursive_serialize_or_str(v) for k, v in obj.items()}
            elif isinstance(obj, list):
                return [recursive_serialize_or_str(v) for v in obj]
            elif isinstance(obj, BaseModel):
                # Pydantic model instance: dump to a dict and recurse into it.
                return {k: recursive_serialize_or_str(v) for k, v in obj.model_dump().items()}
            elif isinstance(obj, (AsyncIterator, Generator, Iterator)):
                # Streams are not consumed during serialization.
                return "Unconsumed Stream"
            return str(obj)
        except Exception as e:
            if isinstance(obj, dict):
                # Log only the first few keys to keep the message short.
                logger.debug("Cannot serialize dictionary with keys: %s...: %s", list(obj.keys())[:5], e)
            else:
                logger.debug("Cannot serialize object %s: %s", obj, e)
            return str(obj)
  2. Fix NameError for 'BaseModel':

    • Ensure BaseModel is correctly imported from pydantic.
    from pydantic import BaseModel

These changes should help reduce excessive logging and fix the NameError [1].


@codenprogressive

codenprogressive commented Oct 13, 2024

I would like to report the same issue. The line below raises "TypeError: issubclass() arg 1 must be a class" whenever obj is not a class:

if issubclass(obj, BaseModel):

Here is a simple flow to reproduce the error:
reproduce typeerror issubclass.json

screenshot of flow:
image

As highlighted by @dhlidongming, this bug significantly slows down execution of the whole flow when a dictionary contains many items.

@ogabrielluiz
Contributor

Hey @dhlidongming

We made some changes to the logging, and this must have slipped under the radar. Thanks for the report!

We'll fix it right away.

3 participants