feat(integrations): Slack #353

zereraz · 2025-03-23T07:59:41Z

Description

Add slack integration

Testing

Additional Notes

Summary by CodeRabbit

New Features
- Enhanced autocomplete now offers chat user suggestions and displays chat message results.
- New integrations sidebar streamlines navigation for Google and Slack integrations.
- API key creation is now available for simplified connector management.
Enhancements
- Refined search capabilities now include richer chat data.
- Updated UI elements with improved active state styling and loading indicators.
Infrastructure Improvements
- Expanded backend support for Slack integrations and chat data, with updated configurations for a smoother experience.

gemini-code-assist

Hello @zereraz, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces Slack integration to the application, allowing users to connect their Slack workspace and ingest data. It includes changes to the frontend to display Slack integration options and to handle Slack message search results. The PR also modifies the backend to handle Slack OAuth and API key authentication, fetch data from Slack, and index it in Vespa.

Highlights

Frontend Changes for Slack Integration: The frontend is updated to include a Slack integration option in the sidebar and integration settings page. New components like IntegrationsSidebar and modifications to OAuthTab are added to manage the Slack integration flow.
Backend Implementation for Slack: The backend is extended to support Slack OAuth and API key authentication. New API endpoints are created to handle OAuth callbacks and API key submissions. The code also includes the logic to fetch data from Slack and index it in Vespa.
Vespa Schemas and Indexing: New Vespa schemas are defined for Slack data, including chat_message, chat_container, chat_user, and chat_team. The indexing logic is updated to handle these new schemas, allowing users to search for Slack messages and related data.
Data Ingestion and Tracking: The pull request implements data ingestion from Slack using the Slack API. It includes logic to fetch conversations, messages, and users, and to track the progress of the ingestion process. New types and functions are added to support Slack-specific data structures and operations.
Route Tree and Navigation: The route tree is updated to include new routes for Slack integration, both for regular users and admin users. The navigation logic is modified to handle these new routes, ensuring that users can access the Slack integration settings page.

Changelog

Click here to see the changelog

frontend/src/assets/google-logo.svg
- Added a Google logo SVG file.
frontend/src/assets/slack.svg
- Added a Slack logo SVG file.
frontend/src/components/Autocomplete.tsx
- Added ChatUserAutocompleteElement component to display chat user autocomplete results.
- Modified AutocompleteElement to handle chat_user result type.
frontend/src/components/GroupFilter.tsx
- Added SlackEntity to the entity enum.
- Updated getName function to handle SlackEntity.Message.
frontend/src/components/HistoryModal.tsx
- Replaced import of LoaderContent from admin integrations with a generic one from lib/common.tsx.
frontend/src/components/IntegrationsSidebar.tsx
- Created a new component IntegrationsSidebar to display integration options.
frontend/src/components/OAuthTab.tsx
- Updated imports to reflect the new location of OAuthButton, OAuthForm and OAuthIntegrationStatus.
frontend/src/components/SearchResult.tsx
- Added a new search result type chat_message to display Slack messages.
- Updated the SearchResult component to handle chat_message results.
frontend/src/components/Sidebar.tsx
- Updated the integrations link to highlight when on integrations pages.
frontend/src/lib/common.tsx
- Added SlackSvg import.
- Updated getIcon function to handle Apps.Slack.
- Moved LoaderContent to lib/common.tsx.
frontend/src/routeTree.gen.ts
- Updated route tree to include new routes for Slack integration, both for regular users and admin users.
frontend/src/routes/_authenticated/admin/integrations/google.tsx
- Updated imports to reflect the new location of OAuthButton, OAuthForm and OAuthIntegrationStatus.
- Moved LoaderContent and OAuthIntegrationStatus to lib/common.tsx and types.ts respectively.
- Added IntegrationsSidebar component.
frontend/src/routes/_authenticated/admin/integrations/index.tsx
- Created a new route component to redirect to Google integrations page.
frontend/src/routes/_authenticated/admin/integrations/slack.tsx
- Created a new route component for Slack integration settings.
frontend/src/routes/_authenticated/integrations/google.tsx
- Updated imports to reflect the new location of OAuthButton, OAuthForm and OAuthIntegrationStatus.
- Added IntegrationsSidebar component.
frontend/src/routes/_authenticated/integrations/index.tsx
- Created a new route component to redirect to Google integrations page.
frontend/src/routes/_authenticated/integrations/slack.tsx
- Created a new route component for Slack integration settings.
frontend/src/types.ts
- Added OAuthIntegrationStatus enum.
server/ai/context.ts
- Added chatMessageSchema import.
- Added VespaChatMessageSearch type import.
- Added constructSlackMessageContext function.
- Updated answerContextMap function to handle chatMessageSchema.
server/ai/mappers.ts
- Updated Claude model names.
server/api/admin.ts
- Added slackHost import from config.
- Added Slack import from arctic.
- Updated getAuthorizationUrl function to handle Slack OAuth.
- Updated CreateOAuthProvider function to handle Slack OAuth.
- Added AddApiKeyConnector function to handle Slack API key authentication.
server/api/chat.ts
- Added chatMessageSchema import.
- Added VespaChatMessage type import.
- Updated searchToCitation function to handle chatMessageSchema.
server/api/oauth.ts
- Added Slack import from arctic.
- Added SlackOAuthResp interface.
- Updated OAuthCallback function to handle Slack OAuth.
server/config.ts
- Added slackHost config variable.
server/db/connector.ts
- Added apiKey parameter to insertConnector function.
server/db/schema.ts
- Added apiKey column to connectors table.
server/integrations/google/gmail-worker.ts
- Updated import for StatType from integrations/tracker.ts.
server/integrations/google/gmail/index.ts
- Updated import for StatType and updateUserStats from integrations/tracker.ts.
server/integrations/google/index.ts
- Updated imports to reflect the new location of wsConnections and sendWebsocketMessage.
- Updated imports to reflect the new location of getProgress, markUserComplete, oAuthTracker, serviceAccountTracker, setOAuthUser, setTotalUsers, StatType, updateUserStats.
- Updated handleGoogleOAuthIngestion and handleGoogleServiceAccountIngestion functions to use the new tracking system.
server/integrations/google/tracking.ts
- Removed the tracking system and related functions.
server/integrations/metricStream.ts
- Created a new file to handle websocket connections and messages.
server/integrations/slack/index.ts
- Created a new file to handle Slack integration logic.
server/migrations/0002_natural_arclight.sql
- Added slack value to app_type enum.
- Added api_key column to connectors table.
server/migrations/0003_jittery_speed.sql
- Updated foreign key constraint for oauth_providers table.
server/search/mappers.ts
- Added chatMessageSchema import.
- Added VespaChatMessageSearch type import.
- Updated VespaSearchResponseToSearchResult function to handle chatMessageSchema.
server/search/types.ts
- Added new types and enums for Slack integration.
- Updated VespaSearchFieldsUnionSchema to include new schemas for Slack data.
server/search/vespa.ts
- Added chatMessageSchema import.
- Updated insert function to handle new schemas for Slack data.
- Updated autocomplete function to include Slack data.
server/search/vespaClient.ts
- Updated insert function to handle new schemas for Slack data.
server/server.ts
- Added addApiKeyConnectorSchema import.
- Added AddApiKeyConnector import.
- Added new route for Slack API key authentication.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Did you know?

Slack was initially developed as an internal tool for a video game company that was working on a game that ultimately wasn't released.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces Slack integration and enhances Google integration, adding new features and improving existing ones. The code changes are extensive, covering frontend components, backend APIs, and database schema modifications. While the overall implementation seems well-structured, there are areas where additional documentation, testing, and error handling could improve the code's quality and maintainability.

Summary of Findings

Missing documentation: The pull request lacks a description of the changes, testing information, and additional notes. Please provide this information to help reviewers understand the purpose and impact of the changes.
OAuth integration: The OAuth integration implementation appears complex, involving multiple states and components. Ensure that the integration flow is well-tested and handles various scenarios, including error cases and user interactions.
Potential rate limiting issues: The Slack integration involves fetching messages and replies, which could lead to rate limiting issues with the Slack API. Implement appropriate error handling and retry mechanisms to mitigate this risk.

Merge Readiness

The pull request introduces significant new functionality and touches many parts of the codebase. Due to the high and critical severity issues identified, I recommend that the author address these issues before merging. I am unable to approve this pull request, and users should have others review and approve this code before merging.

coderabbitai · 2025-03-24T10:16:38Z

Walkthrough

The changes span both the frontend and backend. On the frontend, new UI components (e.g., chat user autocomplete element and integrations sidebar) have been added, and existing components enhanced with additional conditional rendering and styling. Route definitions and import paths have been updated to support distinct integration flows. On the backend, OAuth handling has been extended to support Slack alongside Google, a new API key connector endpoint has been introduced, and several database schema migrations and configuration updates have been applied. Additional updates include new search mappers, Vespa schema definitions for chat data, and a new tracking system for integration statistics.

Changes

File(s)	Change Summary
`frontend/src/components/Autocomplete.tsx`, `frontend/src/components/GroupFilter.tsx`, `frontend/src/components/HistoryModal.tsx`, `frontend/src/components/IntegrationsSidebar.tsx`, `frontend/src/components/OAuthTab.tsx`, `frontend/src/components/SearchResult.tsx`, `frontend/src/components/Sidebar.tsx`	Added new UI elements: chat user autocomplete, chat message rendering, integration sidebar; updated conditional rendering and styling; modified import paths.
`frontend/src/lib/common.tsx`, `frontend/src/routeTree.gen.ts`, `frontend/src/routes/_authenticated/**`	Introduced new route definitions (Google, Slack, admin integrations), updated redirection logic and import paths; added library enhancements such as a new loader and constant definitions.
`server/api/admin.ts`, `server/api/chat.ts`, `server/api/oauth.ts`, `server/server.ts`, `server/queue/index.ts`, `server/utils.ts`	Extended OAuth flow to handle Slack; introduced API key connector method; updated WebSocket imports; added job handling logic for Slack ingestion.
`server/.env.development`, `server/config.ts`, `server/migrations/**`, `server/package.json`	Added a new environment variable for encryption; modified database schemas by adding API key columns, updating enums and foreign keys; introduced a Slack dependency.
`server/search/mappers.ts`, `server/search/types.ts`, `server/vespa/vespa.ts`, `server/vespa/vespaClient.ts`, and various Vespa schema files (`*.sd`) plus `server/vespa/services.xml`	Expanded search mapping to include chat message and user schemas; standardized tensor dimensions to 384 across schemas; added new document types for chat data and updated ranking profiles.
`server/integrations/slack/index.ts`, `server/integrations/tracker.ts`	Introduced comprehensive Slack integration functions and a new Tracker class to handle user statistics across integrations.

Sequence Diagram(s)

Slack OAuth Callback Flow

sequenceDiagram
    participant U as User
    participant C as Client Application
    participant O as OAuthCallback Handler
    participant S as Slack API
    U->>C: Initiates Slack OAuth flow
    C->>O: Redirects with Authorization Code
    O->>S: POST auth code & credentials to Slack OAuth endpoint
    S-->>O: Returns token data (accessToken, team info, etc.)
    O->>DB: Update connector with Slack tokens
    O->>C: Redirect to Slack success URL

API Key Connector Creation Flow

sequenceDiagram
    participant Cl as Client
    participant SV as API Endpoint (/apikey/create)
    participant V as Validator (zValidator)
    participant A as AddApiKeyConnector Handler
    Cl->>SV: Sends API key creation request
    SV->>V: Validates request payload
    V-->>SV: Returns validated data
    SV->>A: Processes API key connector creation
    A-->>SV: Returns connector confirmation
    SV->>Cl: Sends success response

Poem

I'm a rabbit in the code meadow,
Hopping through new routes and bright UI glows.
Slack OAuth and API keys now lead the way,
With schemas and tracks keeping bugs at bay.
My ears twitch as modules grow,
In a garden of code where fresh features flow.
🐇🌱 Happy coding, let the changes play!

✨ Finishing Touches

📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai plan to trigger planning for file edits and PR creation.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 19

🔭 Outside diff range comments (4)

server/.env.development (1)

1-1: ⚠️ Potential issue

Security concern: Encryption key in version control

Storing encryption keys (even for development environments) in version control is a security risk. This could potentially be exposed if the repository is ever made public or accessed by unauthorized individuals.

Consider:

Using a secure vault or environment-specific secrets management

Providing a template file (.env.example) with placeholders instead

Adding .env.* files to .gitignore

Additionally, check if the "+" at the beginning of the key is intentional or a typo.

server/db/connector.ts (1)

42-62: 🛠️ Refactor suggestion

Securely handle apiKey storage.
Including the apiKey in plain text could pose security risks. If it needs to be stored, consider encrypting it, using a key management system, or ensuring proper masking and redaction in logs.

server/integrations/slack/index.ts (1)

586-777: 🛠️ Refactor suggestion

Provide transaction or rollback logic on partial failures.
handleSlackIngestion has multiple points of failure. Consider wrapping more DB operations in a transaction to ensure consistency if an exception is thrown mid-process.

server/integrations/google/index.ts (1)

1-2156: 🛠️ Refactor suggestion

Synchronize partial failures in insertFilesForUser().
At lines ~1165-1258, an error in PDF ingestion or docs retrieval logs the error but continues. You might lose track of failed items. Consider a local retry or collect a final report of skipped documents to requeue or notify.

🧹 Nitpick comments (62)

server/search/types.ts (2)

79-85: New SlackEntity enum might risk overlap with other “file” entity.

Defining SlackEntity is beneficial for segregating Slack “team,” “user,” “message,” “channel,” and “file.” However, note there is also a DriveEntity.File in the codebase. This is likely fine if you consistently differentiate the Slack domain from Drive. Otherwise, consider renaming for clarity.

862-885: New ChatMessageResponseSchema for Slack messages.

This schema neatly mirrors the stored fields in VespaChatMessageSchema. The commented-out relevance field suggests an intentional omission. Verify whether Slack search results should display or filter by relevance for consistency.
frontend/src/routes/_authenticated/admin/integrations/index.tsx (1)
8-16: Consider adding loading state and error handling

The component correctly redirects to the Google integrations page, but lacks any loading state or error handling during navigation.

Consider adding:
function RouteComponent() {
  const navigate = useNavigate()
+ const [isNavigating, setIsNavigating] = useState(true)
+ const [error, setError] = useState<Error | null>(null)

  useEffect(() => {
-   navigate({ to: "/admin/integrations/google", replace: true })
+   navigate({ to: "/admin/integrations/google", replace: true })
+     .catch(err => {
+       setError(err)
+     })
+     .finally(() => {
+       setIsNavigating(false)
+     })
  }, [navigate])

- return <></>
+ if (error) return <div>Error navigating: {error.message}</div>
+ if (isNavigating) return <div>Redirecting...</div>
+ return <></>
}
server/types.ts (1)
75-80: Add validation and documentation for the API key schema

The schema structure looks good but lacks validation for API key format and documentation.

Consider enhancing it with:
+/**
+ * Schema for creating a connector using an API key
+ * Used for integrations like Slack that support API key authentication
+ */
export const addApiKeyConnectorSchema = z.object({
  app: z.nativeEnum(Apps),
-  apiKey: z.string(),
+  apiKey: z.string().min(1, "API key cannot be empty")
+    .refine(key => !key.includes(" "), "API key should not contain spaces"),
})

+/**
+ * Type representing an API key connector configuration
+ */
export type ApiKeyConnector = z.infer<typeof addApiKeyConnectorSchema>
This adds basic validation and documentation to improve code quality and maintainability.
frontend/src/components/SearchResult.tsx (4)
143-144: Remove console.log statement before pushing to production.

The console.log statement on line 144 should be removed as it's likely a debugging artifact that shouldn't be included in production code.
-  } else if (result.type === "chat_message") {
-    console.log(result)
+  } else if (result.type === "chat_message") {
146-156: LGTM, but consider adding null checks for Slack URLs.

The JSX structure for rendering Slack message links follows consistent patterns with other result types, which is good. However, consider adding null or undefined checks for the URL properties (domain, channelId, createdAt) to prevent runtime errors if these values are missing.
        <div className="flex items-center justify-start space-x-2">
            <a
-              href={`https://${result.domain}.slack.com/archives/${result.channelId}/p${result.createdAt}`}
+              href={result.domain && result.channelId && result.createdAt 
+                ? `https://${result.domain}.slack.com/archives/${result.channelId}/p${result.createdAt}` 
+                : "#"}
              target="_blank"
              rel="noopener noreferrer"
              className="flex items-center text-blue-800 space-x-2"
            >
157-173: Add fallback for missing user image.

There's no fallback handling if the user image is missing. Consider adding a fallback like other cases in this component.
        <div className="flex flex-row items-center mt-1 ml-[44px]">
          <img
            referrerPolicy="no-referrer"
            className="mr-2 w-[16px] h-[16px] rounded-full"
-            src={result.image}
+            src={result.image || "/default-avatar.png"}
            alt={`${result.name}'s profile`}
          ></img>
          <a
            target="_blank"
            className="text-[#2067F5]"
            rel="noopener noreferrer"
-            href={`https://${result.domain}.slack.com/team/${result.userId}`}
+            href={result.domain && result.userId 
+              ? `https://${result.domain}.slack.com/team/${result.userId}` 
+              : "#"}
          >
174-176: Consider adding a consistent line limit for message text.

The message text is displayed without any line clamping, which might lead to overly long messages taking up too much space in the search results. Consider adding the same line-clamp class that's used for other result types.
-        {result.text && <HighlightedText chunk_summary={result.text} />}
+        {result.text && (
+          <div className="text-left line-clamp-[2.5] text-ellipsis overflow-hidden">
+            <HighlightedText chunk_summary={result.text} />
+          </div>
+        )}
frontend/src/routes/_authenticated/integrations/index.tsx (1)
8-16: Add a loading state during redirect.

The component automatically redirects to Google integrations, but doesn't show any loading indicator during the navigation. Consider adding a loading state to improve user experience.
function RouteComponent() {
  const navigate = useNavigate()
+  const [isLoading, setIsLoading] = useState(true)

  useEffect(() => {
    navigate({ to: "/integrations/google", replace: true })
+      .finally(() => setIsLoading(false))
  }, [navigate])

-  return <></>
+  return (
+    <>
+      {isLoading && (
+        <div className="flex justify-center items-center h-screen">
+          <p>Redirecting to integrations...</p>
+        </div>
+      )}
+    </>
+  )
}
Don't forget to import useState:
-import { createFileRoute, useNavigate } from "@tanstack/react-router"
-import { useEffect } from "react"
+import { createFileRoute, useNavigate } from "@tanstack/react-router"
+import { useEffect, useState } from "react"
server/vespa/validation-overrides.xml (2)
4-5: Clarify duplicate field-type-change override.

There's already an existing field-type-change override until 2025-02-25 on line 3. Having two separate overrides for the same rule with different expiration dates may cause confusion. Consider removing the older entry if it's being replaced.
<validation-overrides>
  <allow until="2025-02-25">indexing-change</allow>
-   <allow until='2025-02-25'>field-type-change</allow>
   <allow until='2025-03-25'>field-type-change</allow>
   <allow until='2025-03-25'>global-document-change</allow> 
</validation-overrides>
5-5: Document the reason for global-document-change override.

Adding a global-document-change override suggests significant schema changes. Consider adding a comment explaining the purpose of this override to help future developers understand why it was necessary.
-   <allow until='2025-03-25'>global-document-change</allow> 
+   
+   <allow until='2025-03-25'>global-document-change</allow> 
server/api/chat.ts (1)
293-299: New chat message citation handling

The added conditional block properly extends the citation functionality to chat messages, maintaining consistency with the existing citation handling for other entity types.

However, the static URL https://google.com doesn't seem to point to a meaningful location for a chat message. Consider using a more relevant URL or making it dynamic based on the message context.
  } else if (result.fields.sddocname === chatMessageSchema) {
    return {
      title: (fields as VespaChatMessage).text,
-     url: `https://google.com`,
+     url: `${window.location.origin}/chat/${(fields as VespaChatMessage).chatId}`,
      app: (fields as VespaChatMessage).app,
      entity: (fields as VespaChatMessage).entity,
    }
frontend/src/routes/_authenticated/integrations/slack.tsx (1)
8-14: Implement or remove commented-out admin redirection logic.

The commented-out code appears to be intended to redirect admin users away from this route, but it's currently inactive. Either implement this functionality properly or remove the commented code.

If this redirection logic is needed:
- // Admins should be redirected to visit /admin/integrations
- // if (
- //   userWorkspace?.user?.role === UserRole.SuperAdmin ||
- //   userWorkspace?.user?.role === UserRole.Admin
- // ) {
- //   throw redirect({ to: '/admin/integrations/' })
- // }
+ // Admins should be redirected to visit /admin/integrations
+ if (
+   userWorkspace?.user?.role === UserRole.SuperAdmin ||
+   userWorkspace?.user?.role === UserRole.Admin
+ ) {
+   throw redirect({ to: '/admin/integrations/' })
+ }
Don't forget to import the necessary UserRole enum and redirect function.
server/ai/context.ts (2)
82-93: Standardize formatting in the constructSlackMessageContext function.

The template string formatting in this function contains inconsistent indentation that doesn't match the pattern used in other similar functions in this file. Also, check for optional fields like in other context constructors.
const constructSlackMessageContext = (
  fields: VespaChatMessageSearch,
  relevance: number,
): string => {
  return `App: ${fields.app}
-    Entity: ${fields.entity}
-    User: ${fields.name}
-    Username: ${fields.username}
-    Message: ${fields.text}
-    vespa relevance score: ${relevance}
-    `
+Entity: ${fields.entity}
+${fields.name ? `User: ${fields.name}` : ""}
+${fields.username ? `Username: ${fields.username}` : ""}
+${fields.text ? `Message: ${fields.text}` : ""}
+vespa relevance score: ${relevance}`
}
348-353: Clarify comment and add error handling for Slack message context.

The comment "// later can be based on app" is vague and suggests temporary implementation. Additionally, ensure that this new conditional branch handles type mismatches appropriately.
else if (searchResult.fields.sddocname === chatMessageSchema) {
-  // later can be based on app
+  // TODO: Refine message context based on specific chat app (Slack, MS Teams, etc.)
+  if (!isValidChatMessageFields(searchResult.fields)) {
+    throw new Error(
+      `Invalid chat message fields: ${JSON.stringify(searchResult.fields)}`,
+    )
+  }
  return constructSlackMessageContext(
    searchResult.fields,
    searchResult.relevance,
  )
}
Add a helper function to validate chat message fields:
function isValidChatMessageFields(fields: any): fields is VespaChatMessageSearch {
  return fields.app && fields.entity && fields.text;
}
server/queue/index.ts (2)
87-92: Use strict equality operator for consistency.

The code uses loose equality (==) for comparing jobData.app with Apps.Slack while strict equality (===) is used elsewhere in the file. This inconsistency could lead to unexpected behavior.
else if (
-  jobData.app == Apps.Slack &&
+  jobData.app === Apps.Slack &&
  jobData.authType === AuthType.OAuth
) {
  await handleSlackIngestion(boss, job)
}
94-96: Remove or implement commented code for API key auth.

The commented code for API key authentication creates ambiguity about whether this feature is planned for future implementation or was intentionally disabled. Either remove it if not needed or implement it properly.
-// else if (jobData.app == Apps.Slack && jobData.authType === AuthType.ApiKey) {
-//   await handleSlackIngestion(boss, job)
-// }
If this functionality is planned for a future release, add a TODO comment with more details:
+// TODO: Implement API key auth for Slack integration in future release (planned for v2.x)
server/integrations/metricStream.ts (1)
9-14: Enhance error handling and add type safety to sendWebsocketMessage.

The function doesn't handle potential errors when sending messages and doesn't validate message content, which could lead to runtime errors if the message is malformed or the WebSocket is in an invalid state.
-export const sendWebsocketMessage = (message: string, connectorId: string) => {
+export const sendWebsocketMessage = (message: string, connectorId: string): boolean => {
  const ws: WSContext = wsConnections.get(connectorId)
  if (ws) {
-    ws.send(JSON.stringify({ message }))
+    try {
+      ws.send(JSON.stringify({ message }))
+      return true
+    } catch (error) {
+      console.error(`Failed to send WebSocket message for connector ${connectorId}:`, error)
+      return false
+    }
  }
+  return false
}
frontend/src/components/Sidebar.tsx (1)

83-83: Simplified Plug icon properties

The icon properties have been simplified to use consistent sizing and color values. This change improves code maintainability.

Consider using design tokens or theme variables instead of hardcoded color values (#384049) for better consistency across the application.
server/vespa/schemas/mail_attachment.sd (1)
66-66: Consider using a constant instead of hardcoding the embedding dimension value.

Hardcoding the dimension value (384) across multiple locations reduces flexibility. If the embedding size changes in the future, you'll need to update multiple files.

Consider defining a constant at the schema level or using a schema parameter:
-field chunk_embeddings type tensor<bfloat16>(p{}, v[384])  {
+field chunk_embeddings type tensor<bfloat16>(p{}, v[EMBEDDING_DIM])  {
With a schema declaration like:
constant EMBEDDING_DIM = 384
Also applies to: 73-73, 102-102
frontend/src/components/IntegrationsSidebar.tsx (1)
7-55: Clean up commented code and consider extracting styles for better maintainability.

The component implementation is solid, but there are a few things to improve:

There's commented-out code in the Slack integration navigation logic (lines 39-42)

The inline Tailwind styles could be extracted for better readability

Consider cleaning up the commented code and extracting common styles:
// Clean up commented code
onClick={() => {
  router.navigate({
    to: "/integrations/slack",
  })
}}

// Extract common styles to a variable
const listItemBaseClass = "group flex justify-between items-center hover:bg-[#EBEFF2] rounded-[6px] pt-[8px] pb-[8px] ml-[8px] mr-[8px]"
server/search/vespaClient.ts (2)
181-185: Avoid logging full error body if it might contain PII or secrets.
While logging errorBody is helpful for debugging, the response might include sensitive information. Consider sanitizing or partially redacting it if there's a risk of leaking credentials or PII.

194-199: Consolidate the redundant if-else block.
Because the code already throws an error when !response.ok, this else branch is never reached in an error scenario and is empty. Removing it keeps the code cleaner:
       const data = await response.json()
-      if (response.ok) {
-        // Logger.info(`Document ${document.docId} inserted successfully`)
-      } else {
-      }
+      // Logger.info(`Document ${document.docId} inserted successfully`)
server/search/vespa.ts (1)

296-305: Extended block for semantic or userInput queries.
This repeated pattern might benefit from a helper function for constructing typed nearestNeighbor or userInput blocks, but it’s not critical.

server/api/oauth.ts (6)

35-36: Console logs in production code.
Consider removing or downgrading these console.log statements to debug logs in production for cleanliness.

48-49: Debugging state checks.
Again, these console.log statements are helpful for development but can be removed or replaced with structured debug logs if you’re no longer actively troubleshooting.

57-57: Verifying code verifier logs.
Logging sensitive PKCE data might be risky. Consider removing to avoid accidental exposure.

65-65: Logging provider details.
Consider removing or limiting the scope of these logs if provider might contain sensitive data.

107-107: Graceful handling of invalid app type.
Throwing an HTTP 500 for unknown apps is acceptable but consider a 400 if this can occur from erroneous client input.

112-112: Storing OAuth credentials.
Serializing tokens is straightforward. Double-check that these stored tokens are encrypted at rest or masked in logs to ensure security.

Would you like a quick snippet or a new issue to track encryption at rest for these credentials?

server/vespa/schemas/chat_team.sd (2)

7-9: Consider using reference field for team membership.

The docId field is properly defined for indexing, but consider adding a reference field in related schemas (e.g., chat_message) to establish proper relationships between teams and messages/users.

65-69: Consider enhancing the ranking profile.

The current rank profile uses only nativeRank, which is basic. As the application evolves, you might want to add more sophisticated ranking factors such as recency or relevance components similar to what's used in the file schema.

frontend/src/routes/_authenticated/admin/integrations/google.tsx (3)

220-226: Avoid using any in function props.
Using any for onSuccess and refetch reduces type safety. Consider specifying stronger types to enhance maintainability.

326-333: Replace any type in OAuthButton.
The setOAuthIntegrationStatus callback can benefit from a more specific type to prevent confusion and runtime errors.

382-388: Use typed structures for userStats.
Rather than { [email: string]: any }, define an interface to represent user stats. Stronger typing improves readability and maintainability.

server/integrations/slack/index.ts (4)

46-60: Rate-limit handling for user listing.
While the cursor approach repeats until exhausted, consider adding rate-limit checks or delays if workspace has many users.

62-70: Surface errors from getTeamInfo for better troubleshooting.
In addition to throwing a generic error, consider logging relevant Slack error codes or diagnosing common failure causes.

73-97: Potential performance optimization for large workspaces.
When encountering thousands of conversations, repeated calls can be costly. Evaluate concurrency or chunk-based ingestion while respecting Slack rate limits.

178-193: Consider caching member lookup for mentions.
Repeatedly fetching user info can quickly use Slack API rate limits. Caching or a single pass might improve performance.
frontend/src/routes/_authenticated/admin/integrations/slack.tsx (4)
38-59: Consider using structured JSON for error responses.
Currently, the code reads the error text via await response.text() and throws a string. If the backend typically provides structured JSON errors, parsing the response as JSON and handling error fields individually can improve clarity and error handling downstream.

237-293: Add token format validation for immediate feedback.
Slack Bot tokens often start with “xoxb-”. Basic validation in the form could provide early warning if the format is incorrect, saving debugging time.
+// Example of a minimal token format check before submission:
+const isSlackBotToken = (token: string) => token.startsWith("xoxb-");
+
const form = useForm<{ botToken: string }>({
  ...
  onSubmit: async ({ value }) => {
    + if (!isSlackBotToken(value.botToken)) {
    +   toast({
    +     title: "Invalid Bot Token",
    +     description: "Slack Bot tokens typically start with xoxb-",
    +     variant: "destructive",
    +   });
    +   return;
    + }
    ...
  }
});
428-474: Add a reconnection strategy for the WebSocket.
If the WebSocket closes prematurely or due to temporary network issues, consider automatically reconnecting after a brief delay. This ensures a more robust experience when retrieving Slack integration updates.

532-536: Clarify instructions for user and bot token differences.
Offering more guidance on when to use each token type can reduce confusion. For example, a short note or tooltip describing each token’s role can help first-time users.
server/vespa/schemas/chat_container.sd (1)

87-92: Evaluate performance overhead of embeddings.
Storing large tensor<bfloat16>(v[384]) embeddings can be computationally intensive, especially for frequent updates. Verify that you have enough resources and that it fits the usage patterns for realtime or near-realtime queries.

server/vespa/schemas/chat_user.sd (1)

74-81: Assess fuzzy matching performance.
Applying gram-size: 3 on large user datasets can be expensive. Consider indexing usage patterns or limiting n-gram fields if performance becomes an issue, or look into advanced solutions like partial normalization or prefix queries.
frontend/src/routeTree.gen.ts (2)
65-71: Check Trailing Slash Consistency.
Noticing '/integrations/' includes a trailing slash while the matching path registration (line 182) omits it. This mismatch might cause routing inconsistencies.
-    path: '/integrations/',
+    path: '/integrations',
92-98: Check Trailing Slash for Admin Integrations Index.
Similar to the earlier comment about trailing slashes, verify consistency for '/admin/integrations/'.
-    path: '/admin/integrations/',
+    path: '/admin/integrations',
server/integrations/tracker.ts (3)

79-138: initializeUserStats Handles Google/Slack Apps.

Good approach using IsGoogleApp(this.app).

The method is flexible: you can extend for more apps.

Consider removing or refactoring repeated code if more apps are added (DRY principle).

140-153: updateUserStats Uses Dynamic Property Access.

The as any cast bypasses compiler checks—potential type-safety risk.

The logic otherwise is fine for incrementing existing counters.

Consider refining the type definitions to avoid the any cast for safer updates.

195-207: Commented-Out Code is Obsolete.
Removing it fully might simplify maintenance, though it can remain for reference if needed.

server/vespa/schemas/chat_message.sd (3)

46-48: Commented-Out messageType Field.
If no immediate usage, removing it or finalizing might avoid confusion.

58-60: Commented-Out userRef Field.
Remove or finalize as needed to keep the schema minimal.

83-85: Commented-Out reactions Field.
Removing or finishing this field helps clarity.

server/vespa/schemas/chat_attachment.sd (2)

1-136: Consider validating field lengths for textual fields.
While string-typed fields like title, filename, and chunks are indexed with BM25, there's no explicit mention of validating or restricting the maximum length of those fields before indexing. Extremely large strings may degrade indexing performance.

1-136: Review the rank expressions for relevancy.
The rank profile’s expressions combine vector scoring with BM25 in both the first-phase and global-phase. While this is a valid approach, consider refining weighting or normalizing the BM25 factors differently if vector relevance dominates or if textual matching is a strong signal.

server/integrations/google/index.ts (3)

1-2156: Check concurrency limits for calendar event ingestion.
Inside insertCalendarEvents (lines ~414-561), multiple events are processed synchronously. If there's a large volume of events, consider parallelizing the insertion where feasible, or add concurrency control to avoid overwhelming the system.

1-2156: Revisit the WebSocket interval approach for progress updates.
At lines ~593-601 and ~848-855, an interval of 4 seconds sends updates to the WebSocket. If ingestion is lengthy, repeated messages could flood the client. Offer an exponential backoff or event-driven approach for fewer but more meaningful updates.

1-2156: Check privacy implications for reading all user PDFs or docs.
In googlePDFsVespa() (lines ~1656-1735) and googleDocsVespa() (lines ~2048-2142), the code processes potentially sensitive user documents. Ensure that either user consent or privacy disclaimers are in place and that content is handled securely if any PII is extracted.

server/migrations/meta/0003_snapshot.json (2)

1-1202: Evaluate foreign key onDelete behavior for references in chats and messages.
By default, the migration sets onDelete: "no action". If you intend to cascade deletions of workspaces or users, consider onDelete: "cascade" to automatically clean up related rows.

1-1202: Assess JSONB usage in columns.
Several tables (like chats.attachments, messages.sources, and sync_history.summary) store data in jsonb. If partial indexing or advanced queries on these fields are needed, ensure GIN indexes are configured, or at least confirm usage patterns.
server/migrations/meta/0002_snapshot.json (2)
470-603: Review of "public.oauth_providers" Table
The "oauth_providers" table contains the necessary fields for managing OAuth settings. One point to note is that the client_secret (lines 516–521) is stored as plain text. Verify that this aligns with your security standards and consider encryption or secure storage mechanisms if necessary.

898-1022: Review of "public.users" Table
The "users" table appropriately captures essential user data. However, note that the deleted_at column (lines 958–963) is defined as not null with a default value of "'1970-01-01T00:00:00Z'". In contrast to other tables that leave "deleted_at" nullable when not deleted, this approach might complicate logic to distinguish active from deleted users. Consider allowing nulls for "deleted_at" to more naturally represent non-deleted records. For example:
-        "deleted_at": {
-          "name": "deleted_at",
-          "type": "timestamp with time zone",
-          "primaryKey": false,
-          "notNull": true,
-          "default": "'1970-01-01T00:00:00Z'"
-        }
+        "deleted_at": {
+          "name": "deleted_at",
+          "type": "timestamp with time zone",
+          "primaryKey": false,
+          "notNull": false,
+          "default": null
+        }

🛑 Comments failed to post (19)

server/ai/mappers.ts (1)
138-139: 💡 Verification agent
🧩 Analysis chain

Verify batch costs for Claude 3.7 Sonnet.

The batch costs for Claude 3.7 Sonnet are set to 0, which is different from other models. Is this intentional? Please verify if batch processing is not supported for this model.

🌐 Web query:
Does Claude 3.7 Sonnet support batch processing?
💡 Result:

Yes, Claude 3.7 Sonnet supports batch processing through Anthropic's Message Batches API. This feature enables asynchronous processing of up to 100,000 requests or 256 MB per batch, offering significant cost savings and efficiency for large-scale operations[1][14].

Key details about batch processing with Claude 3.7 Sonnet:

Cost efficiency: Batch processing costs 50% of standard API pricing, with input/output rates of $1.50/$7.50 per million tokens for Claude 3.7 Sonnet[1][6]

Use cases: Ideal for large-scale evaluations, content moderation, data analysis, and bulk text generation[1][12]

Performance: Most batches complete within 1 hour, with results available for 29 days[1][6]

Extended output: Supports up to 128,000 output tokens in beta mode when paired with batch processing[1][11]

Multi-cloud availability: Accessible through Anthropic's API, AWS Bedrock, and Google Vertex AI[12][14]

The feature supports all capabilities available in standard Messages API requests, including vision, tool use, system messages, and multi-turn conversations[1]. Developers can monitor batch status through API endpoints or Anthropic's console[1][12].

Citations:

1: https://docs.anthropic.com/en/docs/build-with-claude/batch-processing

2: https://www.youtube.com/watch?v=0CrCpXAC2vA

3: https://www.zeniteq.com/blog/anthropics-claude-3-7-sonnet-is-finally-here

4: https://aws.amazon.com/blogs/aws/anthropics-claude-3-7-sonnet-the-first-hybrid-reasoning-model-is-now-available-in-amazon-bedrock/

5: https://openrouter.ai/anthropic/claude-3.7-sonnet

6: https://simonw.substack.com/p/claude-37-sonnet-extended-thinking

7: https://www.datacamp.com/blog/claude-3-7-sonnet

8: https://simonwillison.net/2025/Feb/25/llm-anthropic-014/

9: https://www.aboutamazon.com/news/aws/claude-3-7-sonnet-anthropic-amazon-bedrock

10: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-37.html

11: https://anthropic.com/claude-3-7-sonnet-system-card

12: https://cloud.google.com/blog/products/ai-machine-learning/anthropics-claude-3-7-sonnet-is-available-on-vertex-ai/

13: https://www.axios.com/2025/02/24/anthropic-claude-sonnet-reasoning

14: https://www.anthropic.com/claude/sonnet
ACTION: Update batch cost values for Claude 3.7 Sonnet

The web query confirms that Claude 3.7 Sonnet does support batch processing via Anthropic’s Message Batches API, with batch rates at 50% of standard API pricing (e.g., approximately $1.50 per million input tokens and $7.50 per million output tokens). Given that the values in server/ai/mappers.ts (lines 138-139) are currently set to 0, please update these fields to reflect the correct batch processing costs or add clarifying comments if placeholders are intended.
File: server/ai/mappers.ts
Lines: 138-139
      pricePerThousandInputTokens: 0,
      pricePerThousandOutputTokens: 0,
frontend/src/routes/_authenticated/integrations/slack.tsx (2)
21-22: 🛠️ Refactor suggestion

Add error handling for missing context.

The code directly accesses the last match's context without checking if matches exist or if the context contains the expected properties. This could lead to runtime errors if these assumptions aren't met.
- const matches = useRouterState({ select: (s) => s.matches })
- const { user, workspace } = matches[matches.length - 1].context
+ const matches = useRouterState({ select: (s) => s.matches })
+ if (!matches.length) {
+   // Handle case where no matches exist
+   return <div>Error: Route context not found</div>
+ }
+ const context = matches[matches.length - 1].context
+ if (!context || !context.user || !context.workspace) {
+   // Handle case where context is missing expected properties
+   return <div>Error: User or workspace not found</div>
+ }
+ const { user, workspace } = context
6-7: 🛠️ Refactor suggestion

Remove unnecessary @ts-ignore comment.

The @ts-ignore comment is being used to suppress TypeScript errors without addressing the underlying type issue. This practice can lead to hidden bugs and makes the codebase harder to maintain.
- // @ts-ignore
- const userWorkspace = context
+ const userWorkspace = context as { user: User; workspace: Workspace }
Consider properly typing the context object or extracting only the needed properties to ensure type safety.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    const userWorkspace = context as { user: User; workspace: Workspace }
server/queue/index.ts (1)
17-17: 🛠️ Refactor suggestion

Add error handling for Slack ingestion similar to other integrations.

The import of handleSlackIngestion is added correctly, but unlike the handleGoogleOAuthChanges used in the SyncOAuthSaaSQueue worker, there is no specific error handling for Slack ingestion in the SaaSQueue worker. Consider implementing similar error handling.

In the worker implementation, consider adding try-catch similar to other handlers:
else if (
  jobData.app === Apps.Slack &&
  jobData.authType === AuthType.OAuth
) {
+  try {
    await handleSlackIngestion(boss, job)
+  } catch (error) {
+    const errorMessage = getErrorMessage(error)
+    Logger.error(
+      error,
+      `Unhandled Error while processing Slack ingestion ${errorMessage} ${(error as Error).stack}`,
+    )
+  }
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
else if (
  jobData.app === Apps.Slack &&
  jobData.authType === AuthType.OAuth
) {
  try {
    await handleSlackIngestion(boss, job)
  } catch (error) {
    const errorMessage = getErrorMessage(error)
    Logger.error(
      error,
      `Unhandled Error while processing Slack ingestion ${errorMessage} ${(error as Error).stack}`,
    )
  }
}
server/integrations/metricStream.ts (3)
3-3: 🛠️ Refactor suggestion

Add specific type parameters to the Map declaration.

The Map is declared without type parameters, which reduces type safety. Specify the key and value types to ensure proper type checking.
-export const wsConnections = new Map()
+export const wsConnections = new Map<string, WSContext>()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
export const wsConnections = new Map<string, WSContext>()
5-7: 🛠️ Refactor suggestion

Add error handling to the closeWs function.

The function attempts to close a WebSocket connection without checking if it exists first. This could lead to runtime errors if the connection doesn't exist.
export const closeWs = (connectorId: string) => {
-  wsConnections.get(connectorId)?.close(1000, "Job finished")
+  const ws = wsConnections.get(connectorId)
+  if (ws) {
+    try {
+      ws.close(1000, "Job finished")
+    } catch (error) {
+      console.error(`Failed to close WebSocket connection for connector ${connectorId}:`, error)
+    } finally {
+      wsConnections.delete(connectorId)
+    }
+  }
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
export const closeWs = (connectorId: string) => {
  const ws = wsConnections.get(connectorId)
  if (ws) {
    try {
      ws.close(1000, "Job finished")
    } catch (error) {
      console.error(`Failed to close WebSocket connection for connector ${connectorId}:`, error)
    } finally {
      wsConnections.delete(connectorId)
    }
  }
}
1-14: 🛠️ Refactor suggestion

Add memory management to prevent memory leaks.

The WebSocket connections are stored in a Map but there's no mechanism to remove closed or stale connections, which could lead to memory leaks. Consider adding a cleanup method and documenting the connection lifecycle.
import type { WSContext } from "hono/ws"

export const wsConnections = new Map<string, WSContext>()

+/**
+ * Registers a new WebSocket connection
+ * @param connectorId Unique identifier for the connection
+ * @param ws WebSocket context to register
+ */
+export const registerWsConnection = (connectorId: string, ws: WSContext): void => {
+  // Close existing connection if present
+  closeWs(connectorId)
+  
+  // Register new connection
+  wsConnections.set(connectorId, ws)
+  
+  // Set up event listener to clean up on close
+  ws.addEventListener('close', () => {
+    wsConnections.delete(connectorId)
+  })
+}

export const closeWs = (connectorId: string) => {
  // Implementation as suggested in previous comment
}

export const sendWebsocketMessage = (message: string, connectorId: string) => {
  // Implementation as suggested in previous comment
}
frontend/src/lib/common.tsx (1)
92-103: 🛠️ Refactor suggestion

Added LoaderContent component and minHeight constant

Moving the LoaderContent component to a common location improves code reusability across the application. This is a good refactoring practice.

However, the template string in the className property on line 96 (min-h-[${minHeight}px]) may not work as expected with Tailwind CSS. Tailwind typically requires hardcoded class names for its JIT compiler to generate the appropriate styles.
-    <div
-      className={`min-h-[${minHeight}px] w-full flex items-center justify-center`}
-    >
+    <div
+      className="min-h-[320px] w-full flex items-center justify-center"
+      style={{ minHeight: `${minHeight}px` }}
+    >
Alternatively, if you're using the Tailwind CSS @apply directive or a CSS-in-JS solution that supports dynamic values, please ensure this pattern works as expected.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
export const minHeight = 320
export const LoaderContent = () => {
  return (
    <div
      className="min-h-[320px] w-full flex items-center justify-center"
      style={{ minHeight: `${minHeight}px` }}
    >
      <div className="items-center justify-center">
        <LoadingSpinner className="mr-2 h-4 w-4 animate-spin" />
      </div>
    </div>
  )
}
frontend/src/components/IntegrationsSidebar.tsx (1)
35-42: 💡 Verification agent
🧩 Analysis chain

Consider adding admin-specific routes for Slack similar to Google integration.

The Google integration has different routes for admin and regular users, but the Slack integration navigates to the same route regardless of user role (with commented-out code suggesting this was initially planned).

🏁 Script executed:
#!/bin/bash
# Check if there are separate admin routes for Slack in the codebase
echo "Checking for Slack admin routes..."
rg -g "*.tsx" -g "*.ts" "/admin/integrations/slack"
Length of output: 1747
Update Slack Navigation Based on User Role

The code currently navigates to the same Slack route for all users, but we have confirmed that a separate admin route exists (e.g., in frontend/src/routes/_authenticated/admin/integrations/slack.tsx). Please update the navigation logic so that admin users (SuperAdmin/Admin) are directed to the admin-specific route while regular users continue to navigate to /integrations/slack. For example, consider replacing the current handler:
-  router.navigate({
-    to:
-      // role === UserRole.SuperAdmin || role === UserRole.Admin
-      "/integrations/slack",
-    // : "/integrations/slack",
-  })
+  router.navigate({
+    to: role === UserRole.SuperAdmin || role === UserRole.Admin
+         ? "/admin/integrations/slack"
+         : "/integrations/slack",
+  })
This aligns the Slack integration behavior with the Google integration's role-specific routing.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
              className={`group flex justify-between items-center ${location.pathname.includes("/integrations/slack") ? "bg-[#EBEFF2]" : ""} hover:bg-[#EBEFF2] rounded-[6px] pt-[8px] pb-[8px] ml-[8px] mr-[8px]`}
              onClick={() => {
                router.navigate({
                  to: role === UserRole.SuperAdmin || role === UserRole.Admin
                       ? "/admin/integrations/slack"
                       : "/integrations/slack",
                })
              }}
server/migrations/0003_jittery_speed.sql (1)
1-10: ⚠️ Potential issue

Potential issue with adding a NOT NULL column without default value.

The migration adds a new connector_id column with NOT NULL constraint but doesn't provide a default value or data migration logic. If the table already contains data, this migration might fail.

Consider modifying the migration to handle existing data:
ALTER TABLE "oauth_providers" DROP CONSTRAINT "oauth_providers_container_id_connectors_id_fk";
--> statement-breakpoint
-ALTER TABLE "oauth_providers" ADD COLUMN "connector_id" integer NOT NULL;
+ALTER TABLE "oauth_providers" ADD COLUMN "connector_id" integer;
+UPDATE "oauth_providers" SET "connector_id" = "container_id" WHERE "container_id" IS NOT NULL;
+ALTER TABLE "oauth_providers" ALTER COLUMN "connector_id" SET NOT NULL;
--> statement-breakpoint
DO $$ BEGIN
 ALTER TABLE "oauth_providers" ADD CONSTRAINT "oauth_providers_connector_id_connectors_id_fk" FOREIGN KEY ("connector_id") REFERENCES "public"."connectors"("id") ON DELETE no action ON UPDATE no action;
EXCEPTION
 WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
ALTER TABLE "oauth_providers" DROP COLUMN IF EXISTS "container_id";
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
ALTER TABLE "oauth_providers" DROP CONSTRAINT "oauth_providers_container_id_connectors_id_fk";
--> statement-breakpoint
ALTER TABLE "oauth_providers" ADD COLUMN "connector_id" integer;
UPDATE "oauth_providers" SET "connector_id" = "container_id" WHERE "container_id" IS NOT NULL;
ALTER TABLE "oauth_providers" ALTER COLUMN "connector_id" SET NOT NULL;
--> statement-breakpoint
DO $$ BEGIN
 ALTER TABLE "oauth_providers" ADD CONSTRAINT "oauth_providers_connector_id_connectors_id_fk" FOREIGN KEY ("connector_id") REFERENCES "public"."connectors"("id") ON DELETE no action ON UPDATE no action;
EXCEPTION
 WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
ALTER TABLE "oauth_providers" DROP COLUMN IF EXISTS "container_id";
server/api/admin.ts (2)
239-314: 🛠️ Refactor suggestion

Secure API key handling recommendation.

Storing raw API keys in plaintext can pose security risks if the database or logs are compromised. It’s best practice to encrypt or securely vault these keys.

Below is an example diff to integrate basic encryption at insertion time using Node’s crypto module. For a production system, consider a secure vault/KMS solution:
+import crypto from "crypto";

 export const AddApiKeyConnector = async (c: Context) => {
   // ...
   const apiKey = form.apiKey
   // ...

   // Start a transaction
   return await db.transaction(async (trx) => {
     try {
+       const encryptionKey = process.env.API_KEY_ENCRYPTION_SECRET;
+       const encryptedApiKey = crypto
+         .createCipher("aes-256-cbc", encryptionKey)
+         .update(apiKey, "utf8", "hex") + 
+         crypto.createCipher("aes-256-cbc", encryptionKey)
+               .final("hex");

       const connector = await insertConnector(
         trx,
         user.workspaceId,
         user.id,
         user.workspaceExternalId,
         `${app}-${ConnectorType.SaaS}-${AuthType.ApiKey}`,
         ConnectorType.SaaS,
         AuthType.ApiKey,
         app,
         {},
         null,
         null,
         null,
         null,
-        apiKey,
+        encryptedApiKey,
       )
       // ...
Committable suggestion skipped: line range outside the PR's diff.

49-75: 💡 Verification agent
🧩 Analysis chain

Verify Slack OAuth parameters.

While this Slack OAuth code looks structurally sound, the parameter code is typically returned by Slack upon user authorization rather than set by the client. Slack PKCE flow may require sending code_challenge, not code.

Please confirm Slack’s official OAuth docs to ensure that setting the “code” param in the authorization URL is correct. You can verify with:

🌐 Web query:
Is the Slack PKCE flow expecting "code_challenge" or "code" in the initial authorization request?
💡 Result:

The Slack OAuth flow does not use PKCE (Proof Key for Code Exchange) natively in its standard implementation. Instead, the initial authorization request focuses on obtaining an authorization code through traditional OAuth 2.0 parameters. Here's the breakdown:

Key Observations

Standard Slack OAuth Flow:
Slack's documentation specifies that the initial /authorize request includes parameters like client_id, scope, redirect_uri, and state. The authorization server returns a code to the redirect URL[2][4], which is then exchanged for an access token using the client_secret[4][8]. PKCE-specific parameters like code_challenge are not mentioned in Slack’s core OAuth flows.

PKCE in Slack Tools:
While Slack’s developer tools (e.g., Bolt SDK for Python) allow custom authorization flows[5][14], these do not explicitly require PKCE. Third-party integrations (like Adobe Workfront or Arcade Docs[3][10]) demonstrate PKCE workflows, but Slack's native endpoints (e.g., /oauth/authorize, /oauth.access) do not support PKCE's code_challenge or code_verifier parameters[2][4][8].

Flow Comparison

Parameter Slack OAuth PKCE OAuth

Initial Auth Request client_id, scope, redirect_uri, state[2][4] client_id, code_challenge, code_challenge_method[1][6][10]

Token Exchange client_secret, code[4][8] code_verifier (no client_secret)[1][6][10]

Use Case Web/backend apps with a client_secret[4] Mobile/SPA apps without client_secret[1][10]

Conclusion

Slack’s native OAuth flow expects the code parameter in the authorization response[4], not code_challenge in the initial request. For PKCE, a custom implementation or third-party tool (e.g., Arcade[3]) would be required, as Slack’s core API does not natively support PKCE parameters like code_challenge.

Citations:

1: https://developer.squareup.com/forums/t/oauth-now-supports-pkce/6561

2: https://api.slack.com/legacy/oauth

3: https://docs.arcade.dev/home/auth-providers/oauth2

4: https://api.slack.com/authentication/oauth-v2

5: https://tools.slack.dev/bolt-python/concepts/authorization/

6: https://questdb.com/docs/operations/openid-connect-oidc-integration/

7: https://docs.workato.com/developing-connectors/sdk/guides/authentication/oauth/auth-code-pkce.html

8: https://api.slack.com/web

9: https://deepstrike.io/blog/oauth-from-security-perspective-pt-1

10: https://experienceleague.adobe.com/en/docs/workfront/using/adobe-workfront-api/api-notes/oauth-app-pkce-flow

11: https://api.slack.com/authentication/verifying-requests-from-slack

12: https://assets.ctfassets.net/2ntc334xpx65/7D1vaJ4Q908Th0iklEkFWs/d4a6ebeb2814000214da88a224c9b2f4/OAuth_2.0_and_OpenID_Connect__The_Professional_Guide_Feb6.pdf

13: https://api.slack.com/automation/external-auth

14: https://tools.slack.dev/deno-slack-sdk/tutorials/open-authorization

15: https://faun.pub/authorization-grant-code-with-pkce-189bc8d7abb

16: https://stytch.com/blog/authorization-code-flow-with-pkce/

17: https://help.salesforce.com/s/articleView?id=sf.remoteaccess_oauth_web_server_flow.htm&language=en_US&type=5

18: https://api.slack.com/authentication

19: https://help.salesforce.com/s/articleView?id=sf.remoteaccess_oauth_flow_errors.htm&type=5&language=en_US
Attention: Adjust Slack OAuth URL Parameters in server/api/admin.ts

Based on Slack’s official OAuth documentation, Slack’s native flow expects the authorization code to be returned by Slack after user consent rather than being provided by the client in the initial request. In other words, the current setting of the "code" parameter with a locally generated codeVerifier is not appropriate. If a PKCE-like flow were intended, you’d need to use a "code_challenge" parameter instead—but Slack does not natively support PKCE.

Action: Remove or update the line
url.searchParams.set("code", codeVerifier)
in the Slack block so that only the standard parameters (e.g., "client_id", "redirect_uri", "state", and "user_scope") are included in the authorization URL.
server/search/vespaClient.ts (1)
170-170: 💡 Verification agent
🧩 Analysis chain

Ensure all references to the old insertion methods are updated.
Using a single Inserts union type is a solid approach; please verify that any prior calls to specialized insertion methods are replaced with insert() to avoid runtime type mismatches.

🏁 Script executed:
#!/bin/bash
# Scan for old insert usage across the codebase
rg -A 5 'insertDocument\s*\('
rg -A 5 'insert\s*\('
Length of output: 11522
Action: Replace all legacy insertDocument() references with the new unified insert() method

We've scanned the codebase and found that although the new method defined as
async insert(document: Inserts, options: VespaConfigValues): Promise<void> {
is in place, legacy usages of insertDocument() still exist. For example, in addition to its definition in server/search/vespaClient.ts, several files (such as server/search/vespa.ts and multiple integration modules under server/integrations/) continue to call insertDocument(). To ensure consistent type handling with the unified Inserts union type and avoid runtime mismatches, please replace all instances of insertDocument() with the new insert() method.

server/search/vespaClient.ts: Remove or refactor the legacy insertDocument() method.

server/search/vespa.ts and others: Update call sites that use insertDocument() so they invoke insert() instead.
server/api/oauth.ts (1)
66-94: 💡 Verification agent
🧩 Analysis chain

Slack OAuth token exchange.
Implementation follows Slack’s documented flow. Make sure to handle or log partial failures if the Slack response is only partially valid.

🏁 Script executed:
#!/bin/bash
# Confirm Slack responses are validated properly
rg -C 5 'tokenData.authed_user'
Length of output: 859
Action Required: Enhance Validation for Partial Slack Response Failures

While the current implementation follows Slack’s OAuth flow by checking the overall ok flag on the response, it doesn’t explicitly verify that all critical fields (e.g., tokenData.authed_user.id, access_token, token_type, and scope) are present when ok is true. To guard against scenarios where Slack returns a partially valid response, please add additional validations or logging for missing fields. This ensures that potential issues downstream are caught and handled appropriately.

Verify that tokenData.authed_user and its required sub-fields exist before proceeding.

Log a clear warning or error if any expected field is missing, even when tokenData.ok is true.

Update tests to simulate and check for partial response failures.
server/search/mappers.ts (2)
116-116: 🛠️ Refactor suggestion

Remove debug log statement.

There's a console.log statement that appears to be for debugging purposes. This should be removed before pushing to production.
-            console.log(JSON.stringify(child.fields))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
180-180: 🛠️ Refactor suggestion

Remove debug log statement.

This console.log statement should be removed as it's likely for debugging purposes and not needed in production code.
-          console.log(child, JSON.stringify(child.fields))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
frontend/src/routes/_authenticated/admin/integrations/slack.tsx (1)

132-235: 🛠️ Refactor suggestion

Safeguard secrets and client credentials.
The UI currently collects and sends the client secret in plain text. Consider storing these secrets in a secure location (e.g., encrypted at rest) and using an environment variable or a secure vault. Minimizing direct handling in the frontend reduces the chance of accidental leakage.
server/integrations/tracker.ts (1)
167-180: 🛠️ Refactor suggestion

Slack Progress Method Not Yet Implemented.
Currently returns 0 for Slack progression. This is likely a placeholder.
} else if (this.app === Apps.Slack) {
-  return 0
+  // Implement Slack progress logic here or unify with serviceAccountProgress
}
Committable suggestion skipped: line range outside the PR's diff.
server/vespa/schemas/chat_message.sd (1)
114-154: 💡 Verification agent
🧩 Analysis chain

Rank Profile.

Proper first-phase and global-phase definitions.

freshness_score() uses time-based decay—verify that your createdAt is in ms vs. seconds.

🏁 Script executed:
#!/bin/bash
# Confirm that createdAt is in milliseconds. 
# If createdAt is in seconds, dividing by 1000 here may produce incorrect freshness results.
rg "createdAt" -A 5
Length of output: 20337
Issue with Freshness Score Calculation – Unit Mismatch

After verifying the createdAt definitions and uses across the codebase, note the following:
In server/vespa/schemas/chat_message.sd, the createdAt field is declared as a double. In several integrations (e.g. server/integrations/slack/index.ts), the Slack timestamp is parsed with parseFloat(message.ts!), which suggests that these timestamps are in seconds (e.g., 1734442791.514519) rather than milliseconds.
The freshness_score() function uses the expression
exp(-1 * (now() - (attribute(createdAt) / 1000)) / (3600 * 24))
This division by 1000 is appropriate only if createdAt were in milliseconds. If the field already holds seconds, this conversion will drastically miscalculate the freshness score.
The remainder of the rank-profile (first-phase and global-phase definitions) looks correct.
Action Required:
Please confirm the intended unit for createdAt in chat messages:

If the Slack timestamps are meant to remain in seconds, remove the /1000 conversion in the freshness score expression.

Alternatively, if you intended to work with millisecond timestamps across your schema, update your integrations (e.g., multiply Slack’s message.ts by 1000) before storing values.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (8)

server/api/admin.ts (4)

35-35: Ensure usage of scopes is necessary.
Currently, the import import { scopes } from "@/integrations/google/config" is not referenced in subsequent code. Please verify if this is needed or remove it to reduce clutter.

57-57: Initialize the URL variable only when needed.
Consider defining and initializing url closer to its usage to improve readability. This minimizes unnecessary scope exposure.

118-118: Log message might expose sensitive user info.
Ensure that logging sub (potentially an email) doesn’t risk exposing PII. Consider partial masking or safe logging practices.

158-158: Clarify the null parameter.
The null argument might be a placeholder. If you intend to store data here in future, add a brief inline comment explaining the purpose.

server/api/oauth.ts (4)

15-15: Check slackHost config usage.
slackHost is being used to construct OAuth callback URLs. Confirm that it matches the Slack app’s domain and environment settings (dev vs. prod).

24-32: SlackOAuthResp interface.
You’ve introduced a well-defined structure for Slack’s OAuth response. Consider whether you might also need fields like bot_user_id or incoming_webhook (if relevant) for extended usage.

62-62: Mixed token types.
SlackOAuthResp | OAuthCredentials can introduce confusion. Consider creating a unified interface or separate flows for clearer token handling logic.

136-138: Slack success redirect.
Providing a dedicated success route for Slack is fine. Confirm that end users know which integration’s flow they have completed. Alternatively, unify success endpoints to reduce confusion.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e11048 and 926202e.

📒 Files selected for processing (6)

server/api/admin.ts (5 hunks)
server/api/oauth.ts (4 hunks)
server/config.ts (3 hunks)
server/db/connector.ts (2 hunks)
server/db/oauthProvider.ts (1 hunks)
server/vespa/validation-overrides.xml (1 hunks)

✅ Files skipped from review due to trivial changes (1)

server/vespa/validation-overrides.xml

🚧 Files skipped from review as they are similar to previous changes (3)

server/db/oauthProvider.ts
server/config.ts
server/db/connector.ts

🧰 Additional context used

🧬 Code Definitions (2)

server/api/oauth.ts (4)

server/shared/types.ts (1)

Apps (25-25)

server/types.ts (1)

OAuthCredentials (119-125)

server/utils.ts (1)

IsGoogleApp (200-207)

server/db/connector.ts (1)

updateConnector (238-255)

server/api/admin.ts (6)

server/utils.ts (1)

IsGoogleApp (200-207)

server/shared/types.ts (1)

Apps (25-25)

server/db/user.ts (1)

getUserByEmail (98-107)

server/errors/index.ts (2)

NoUserFound (322-327)

AddServiceConnectionError (330-335)

server/db/connector.ts (1)

insertConnector (29-78)

server/queue/index.ts (2)

boss (23-26)

SaaSQueue (31-31)

🔇 Additional comments (7)

server/api/admin.ts (3)

22-23: Check environment variable usage for slackHost.
The slackHost property is newly introduced. Verify that it's properly set in configuration and environment variables to avoid runtime errors.

Would you like a script to confirm that slackHost is defined in all relevant config files?

62-70: Google OAuth initialization looks good.
The code properly sets up the Google OAuth flow with a code verifier. No immediate issues are apparent.

74-84: Validate Slack OAuth scopes.
We've introduced a Slack OAuth flow. Consider verifying that oauthScopes align with Slack’s actual scopes. Also verify if you need user-level or bot-level scopes so that the requested permissions match your requirements.

server/api/oauth.ts (4)

11-11: Dependencies for Slack and Google.
Ensure that the arctic package version is well-tested for Slack functionality.

Do you want a script to confirm arctic version references in your lockfiles or package manifests?

71-100: Slack token fetch logic.
Beyond tokenData.ok, consider adding more robust handling for unexpected response shapes or network errors. Also, ensure minimal logging of any tokens or secrets.

101-101: Google flow condition check.
Safe separation of Slack vs. Google logic. This approach to selectively handle each OAuth provider is structurally sound.

106-114: Validate Google token acquisition.
Should the token request fail, log as much context as possible without exposing sensitive data. Consider testing or retry logic for intermittent network errors.

coderabbitai · 2025-03-24T10:29:27Z

server/api/admin.ts

+
+// adding first for slack
+// slack is using bot token for the initial ingestion and sync
+// same service will be used for any api key based connector
+export const AddApiKeyConnector = async (c: Context) => {
+  Logger.info("ApiKeyConnector")
+  const { sub, workspaceId } = c.get(JwtPayloadKey)
+  const email = sub
+  const userRes = await getUserByEmail(db, email)
+  if (!userRes || !userRes.length) {
+    throw new NoUserFound({})
+  }
+  const [user] = userRes
+  // @ts-ignore
+  const form: ApiKeyConnector = c.req.valid("form")
+  // const data = await form["service-key"].text()
+  const apiKey = form.apiKey
+  const app = form.app
+
+  // Start a transaction
+  return await db.transaction(async (trx) => {
+    try {
+      // Insert the connection within the transaction
+      const connector = await insertConnector(
+        trx, // Pass the transaction object
+        user.workspaceId,
+        user.id,
+        user.workspaceExternalId,
+        `${app}-${ConnectorType.SaaS}-${AuthType.ApiKey}`,
+        ConnectorType.SaaS,
+        AuthType.ApiKey,
+        app,
+        {},
+        null,
+        null,
+        null,
+        apiKey,
+      )
+
+      const SaasJobPayload: SaaSJob = {
+        connectorId: connector.id,
+        workspaceId: user.workspaceId,
+        userId: user.id,
+        app,
+        externalId: connector.externalId,
+        authType: connector.authType as AuthType,
+        email: sub,
+      }
+      // Enqueue the background job within the same transaction
+      const jobId = await boss.send(SaaSQueue, SaasJobPayload, {
+        singletonKey: connector.externalId,
+        priority: 1,
+        retryLimit: 0,
+      })
+
+      Logger.info(`Job ${jobId} enqueued for connection ${connector.id}`)
+
+      // Commit the transaction if everything is successful
+      return c.json({
+        success: true,
+        message: "Connection created, job enqueued",
+        id: connector.externalId,
+      })
+    } catch (error) {
+      const errMessage = getErrorMessage(error)
+      Logger.error(
+        error,
+        `${new AddServiceConnectionError({ cause: error as Error })} \n : ${errMessage} : ${(error as Error).stack}`,
+      )
+      // Rollback the transaction in case of any error
+      throw new HTTPException(500, {
+        message: "Error creating connection or enqueuing job",
+      })
+    }
+  })
+}


🛠️ Refactor suggestion

AddApiKeyConnector logic.

Double-check secure handling of the incoming API key (encryption or hashing if necessary).

Enforce additional input validation on the API key (length, allowed characters, etc.).

Evaluate concurrency scenarios to avoid duplicate connectors and potential key collisions.

server/api/oauth.ts

coderabbitai

Actionable comments posted: 7

🔭 Outside diff range comments (1)

server/search/vespaClient.ts (1)
172-202: ⚠️ Potential issue

Refactored insert method
Switching from multiple specific types to the Inserts union streamlines document insertion. However, there's a risk when using await response.text() on line 186 and then calling await response.json() on lines 196–197 because the response body can only be read once.
-      const errorBody = await response.text()
-      Logger.error(`Vespa error: ${errorBody}`)
...
-      const data = await response.json()
+      // If you need to log the entire body for error analysis, parse JSON once and handle the message from that object:
+      const errorJson = await response.json()
+      Logger.error(`Vespa error: ${JSON.stringify(errorJson)}`)

🧹 Nitpick comments (32)

server/integrations/tracker.ts (2)
87-146: Streamline the initialization logic for new apps.
The repeated if/else if blocks for Google vs Slack may become unwieldy for future expansions. Consider centralizing each app’s default stat object in a single mapping structure.

Example refactor snippet:
+const defaultStatsByApp = {
+  [Apps.Gmail]: { gmailCount: 0, driveCount: 0, contactsCount: 0, eventsCount: 0, mailAttachmentCount: 0 },
+  [Apps.Slack]: { slackMessageCount: 0, slackConversationCount: 0, slackUserCount: 0, slackMessageReplyCount: 0 },
+  // Add more apps here
+}

  private initializeUserStats(email: string) {
    // ...
-   if (!this.serviceAccountProgress.userStats[email]) {
-     if (IsGoogleApp(this.app)) {
-       this.serviceAccountProgress.userStats[email] = { gmailCount: 0, ... }
-       ...
-     } else if (this.app === Apps.Slack) {
-       this.serviceAccountProgress.userStats[email] = { slackMessageCount: 0, ... }
-       ...
-     }
-   }
+   if (!this.serviceAccountProgress.userStats[email]) {
+     this.serviceAccountProgress.userStats[email] = {
+       ...defaultStatsByApp[this.app],
+       ...baseStats
+     }
+   }

    // Similar approach for this.oAuthProgress.userStats...
  }
210-215: Ensure full consistency of OAuth progress.
setCurrent and setTotal mutate this.oAuthProgress. If other properties (like user) must remain in sync, consider validating or resetting them at the same time. This helps avoid incomplete or inconsistent progress states.
server/vespa/schemas/chat_message.sd (4)

46-49: Consider removing or enabling messageType.
The messageType field is commented out. If no longer needed, removing it would reduce clutter; if needed in the future, consider properly enabling and indexing it now for clarity.

62-65: Revisit the commented-out userRef field.
Similar to messageType, if you anticipate referencing users, enabling the userRef now will help maintain consistent data relationships. Otherwise, removing it might eliminate confusion.

146-147: Check the exponential time decay scale in freshness_score.
Currently, the expression uses a 1-day (24-hour) decay factor. Consider adjusting the factor or making it configurable if you want shorter or longer freshness windows.

Do you need help creating a parameter-driven approach to control freshness decay rates dynamically?

185-195: Evaluate summarization coverage.
Several fields (e.g., reactions) remain commented out in the summary section. Determine whether you want them included for real-time retrieval or keep them excluded to reduce response payload size.

server/search/vespa.ts (2)

50-50: Exported NAMESPACE
Exporting the NAMESPACE constant might be helpful for references in other modules. Consider renaming it to something more descriptive if its usage extends beyond a simplistic placeholder.

510-523: New method GetDocumentWithField
This addition is clear and well-structured. Using an unranked profile is appropriate for simply retrieving documents based on field presence. You might consider wrapping thrown errors in a dedicated custom error class for consistent handling.

server/search/vespaClient.ts (1)

642-690: New getDocumentsWithField method
This approach of matching field existence with matches "." works, but can be expensive for large datasets. To optimize performance, confirm you have adequate pagination and consider filtering to reduce overhead.
server/db/schema.ts (1)
376-383: Consider removing commented code in schema.

The schema contains commented-out fields that appear to be placeholders for future implementation. While this can be helpful during development, it's generally better to add these fields when they're actually needed rather than keeping commented code.
export const googleDriveOAuthIngestionStateSchema = z.object({
  app: z.literal(Apps.GoogleDrive),
  authType: z.literal(AuthType.OAuth),
-  // currentFolderId: z.string().optional(),
-  // lastFileId: z.string().optional(),
-  // lastChangeId: z.string().optional(),
-  // completedFolders: z.array(z.string()),
  lastUpdated: z.string(),
});
server/types.ts (1)
3-3: Unused import from 'bun'.

The connect import from "bun" on line 7 appears to be unused in this file. Consider removing this import to keep the code clean.
import { Apps, AuthType, ConnectorStatus } from "@/shared/types"
import type { PgTransaction } from "drizzle-orm/pg-core"
import type { PostgresJsDatabase } from "drizzle-orm/postgres-js"
import { JWT, type OAuth2Client } from "google-auth-library"
-import { connect } from "bun"
Also applies to: 7-7
server/api/oauth.ts (1)
83-86: Improve error handling for Slack token response.

The current implementation uses a type cast to any and provides limited error context.
- const tokenData = (await response.json()) as any
- if (!tokenData.ok) {
-   throw new Error("Could not get Slack token")
- }
+ const tokenData = await response.json()
+ if (!tokenData.ok) {
+   Logger.error({ error: tokenData.error }, "Failed to retrieve Slack token")
+   throw new Error(`Could not get Slack token: ${tokenData.error || "Unknown error"}`)
+ }
server/api/admin.ts (1)
354-369: Consider using a transaction for delete operations.

Unlike other database operations in this file, the DeleteConnector function doesn't use a transaction, which could lead to inconsistency if additional deletion operations are added later.
export const DeleteConnector = async (c: Context) => {
  const { sub } = c.get(JwtPayloadKey)
  const email = sub
  const userRes = await getUserByEmail(db, email)
  if (!userRes || !userRes.length) {
    throw new NoUserFound({})
  }
  const [user] = userRes
  // @ts-ignore
  const { connectorId }: { connectorId: string } = c.req.valid("form")
- await deleteConnector(db, connectorId, user.id)
- return c.json({
-   success: true,
-   message: "Connector deleted",
- })
+ return await db.transaction(async (trx) => {
+   try {
+     await deleteConnector(trx, connectorId, user.id)
+     return c.json({
+       success: true,
+       message: "Connector deleted",
+     })
+   } catch (error) {
+     Logger.error(error, `Error deleting connector ${connectorId}`)
+     throw new HTTPException(500, { message: "Error deleting connector" })
+   }
+ })
}
server/integrations/ingestionState.ts (2)
13-14: Remove commented code.

There are commented variables that appear to be related to auto-saving functionality. Either implement this functionality or remove the commented code.
- // private saveIntervalMs: number;
- // private timer: NodeJS.Timeout | null = null;
49-84: Add error handling for database operations.

The save and load methods should include error handling to manage database operation failures.
// Persist state to database
async save(): Promise<void> {
  const release = await this.lock.acquire()
  try {
-   await saveConnectorState(
-     this.db,
-     this.connectorId,
-     this.workspaceId,
-     this.userId,
-     this.state,
-   )
+   try {
+     await saveConnectorState(
+       this.db,
+       this.connectorId,
+       this.workspaceId,
+       this.userId,
+       this.state,
+     )
+   } catch (error) {
+     console.error('Failed to save connector state:', error);
+     throw new Error(`Failed to save ingestion state: ${error.message}`);
+   }
  } finally {
    release()
  }
}

// Load state from database
async load(): Promise<void> {
  const release = await this.lock.acquire()
  try {
-   const loadedState = await loadConnectorState<T>(
-     this.db,
-     this.connectorId,
-     this.workspaceId,
-     this.userId,
-   )
+   let loadedState;
+   try {
+     loadedState = await loadConnectorState<T>(
+       this.db,
+       this.connectorId,
+       this.workspaceId,
+       this.userId,
+     )
+   } catch (error) {
+     console.error('Failed to load connector state:', error);
+     throw new Error(`Failed to load ingestion state: ${error.message}`);
+   }
    if (loadedState) {
      this.state = {
        ...this.state,
        ...loadedState,
        lastUpdated: new Date().toISOString(),
      }
    }
  } finally {
    release()
  }
}
server/server.ts (1)
254-254: Remove debugging console.log statement.

This console.log statement appears to be used for debugging and should be removed before merging to production.
-    console.log("here")
server/integrations/slack/index.ts (12)
113-137: Remove commented-out code.

This commented-out function fetchChannelMessages is not used and has been replaced by the insertChannelMessages function implemented later in the file. Commented code should be removed to maintain code cleanliness.
-// const fetchChannelMessages = async (client: WebClient, channelId: string): Promise<SlackMessage[]> => {
-//   let messages: SlackMessage[] = [];
-//   let cursor: string | undefined = undefined;
-//
-//   do {
-//     const response: ConversationsHistoryResponse = await client.conversations.history({
-//       channel: channelId,
-//       limit: 200, // Adjust limit as needed; max is 1000.
-//       cursor,
-//     });
-//
-//     if (!response.ok) {
-//       throw new Error(`Error fetching messages: ${response.error}`);
-//     }
-//
-//     if (response.messages) {
-//       // Type assertion ensures messages are of type SlackMessage[]
-//       messages.push(...(response.messages as SlackMessage[]));
-//     }
-//
-//     cursor = response.response_metadata?.next_cursor;
-//   } while (cursor);
-//
-//   return messages;
-// }
226-353: Complex message processing with potential performance concerns.

The insertChannelMessages function is comprehensive but contains nested API calls for processing thread replies, which could impact performance as noted in the comments. Consider implementing a batch processing approach or Queue-based system for handling thread replies to mitigate API rate limiting issues.

Consider implementing a queue-based system for thread replies processing. This would allow you to:

Decouple the main message ingestion from thread processing

Control the rate of API calls to stay within Slack's limits

Make the ingestion process more resilient to failures

394-398: Replace console.log with Logger.

Console.log statements should be replaced with the Logger instance that's been set up. This ensures consistent logging across the application and better integration with monitoring tools.
-    console.log(`Successfully joined channel: ${channelId}`)
+    Logger.info(`Successfully joined channel: ${channelId}`)
396-397: Improve error handling in joinChannel.

Errors are logged but not propagated, which could hide issues. Consider either propagating the error or adding more detailed logging including the error stack trace.
-    console.error("Error joining channel:", error)
+    Logger.error(error, `Error joining channel ${channelId}: ${getErrorMessage(error)}`)
+    throw new Error(`Failed to join channel ${channelId}: ${getErrorMessage(error)}`)
414-416: Remove @ts-ignore comment.

The @ts-ignore comment suggests a type mismatch. Consider properly typing the conversation object or using a type guard instead of ignoring the type error.
-        // @ts-ignore
-        isIm: (conversation as Channel).is_im!,
+        isIm: 'is_im' in conversation ? conversation.is_im! : false,
454-456: Replace console.error with Logger in getConversationUsers.

For consistency, use the Logger instance for error logging.
-    console.error("Error fetching channel users:", error)
+    Logger.error(error, "Error fetching channel users")
469-471: Improve permissions handling logic.

The permissions array manipulation can be simplified and made more readable.
-  if (!permissions.length || permissions.indexOf(email) == -1) {
-    permissions = permissions.concat(email)
-  }
+  if (!permissions.includes(email)) {
+    permissions = [...permissions, email]
+  }
579-583: Replace console.log with Logger.

Use the Logger instance for consistency.
-    console.log(
-      "total members outside workspace in this channel",
-      members.length,
-    )
+    Logger.info(`Total members outside workspace in this channel: ${members.length}`)
783-786: Replace console.log with Logger for skipped conversations.

Use the Logger instance for consistency.
-          console.log("skipping conversation", conversation)
+          Logger.info(`Skipping conversation: ${JSON.stringify(conversation)}`)
843-845: Replace console.log with Logger for skipped conversations (resumption path).

Similar to the previous comment, use the Logger instance for consistency in the resumption path.
-          console.log("skipping conversation", conversation)
+          Logger.info(`Skipping conversation during resume: ${JSON.stringify(conversation)}`)
883-885: Remove commented-out test code.

The commented-out function call appears to be test code and should be removed before merging.
-// handleSlackIngestion({
-//   data: { connectorId: 8, email: "[email protected]" },
-// })
925-927: Replace console.error with Logger in retryOnFatal.

For consistency, use the Logger instance for error logging.
-        console.error(
-          `Fatal error on attempt ${attempt}: ${(error as Error).message}. Retrying...`,
-        )
+        Logger.error(
+          error,
+          `Fatal error on attempt ${attempt}/${maxAttempts}. Retrying...`
+        )
frontend/src/routes/_authenticated/admin/integrations/slack.tsx (4)
237-237: Fix input ID in client secret field.

The client secret input has the same ID as the client ID input, which could cause issues with form labels. Update the ID to be unique.
-              id="clientId"
+              id="clientSecret"
241-267: Improve form field error display.

The form validates fields but does not actually display error messages to the user. There are conditional blocks that check for errors but always render null. Consider displaying the validation errors.
            {field.state.meta.isTouched && field.state.meta.errors.length
-              ? null
-              : null}
+              ? <p className="text-red-500 text-sm mt-1">{field.state.meta.errors[0]}</p>
+              : null}
646-651: Commented-out code in CardHeader.

There are commented-out CardTitle and CardDescription components. Either restore them or remove the comments entirely for code cleanliness.
-                  {/* <CardTitle>Slack Bot</CardTitle>
-                  <CardDescription>
-                    Add your Slack Bot Token to enable integration.
-                  </CardDescription> */}
+                  <CardTitle>Slack Bot</CardTitle>
+                  <CardDescription>
+                    Add your Slack Bot Token to enable integration.
+                  </CardDescription>
704-709: Commented-out role-based redirect.

There's commented-out code that would redirect admins to a different page. Consider either removing it entirely or implementing it properly if it's a desired feature.

Either remove this commented block completely, or implement the role-based redirection if it's needed:
-    // if (
-    //   userWorkspace?.user?.role === UserRole.SuperAdmin ||
-    //   userWorkspace?.user?.role === UserRole.Admin
-    // ) {
-    //   throw redirect({ to: '/admin/integrations/' })
-    // }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 926202e and 2027075.

📒 Files selected for processing (26)

frontend/src/routes/_authenticated/admin/integrations/slack.tsx (1 hunks)
frontend/src/types.ts (1 hunks)
server/api/admin.ts (7 hunks)
server/api/oauth.ts (2 hunks)
server/db/connector.ts (6 hunks)
server/db/schema.ts (4 hunks)
server/integrations/abortManager.ts (1 hunks)
server/integrations/google/index.ts (19 hunks)
server/integrations/ingestionState.ts (1 hunks)
server/integrations/metricStream.ts (1 hunks)
server/integrations/slack/index.ts (1 hunks)
server/integrations/tracker.ts (1 hunks)
server/migrations/0004_perpetual_shard.sql (1 hunks)
server/migrations/0005_empty_gertrude_yorkes.sql (1 hunks)
server/migrations/meta/0004_snapshot.json (1 hunks)
server/migrations/meta/0005_snapshot.json (1 hunks)
server/migrations/meta/_journal.json (1 hunks)
server/package.json (2 hunks)
server/queue/index.ts (2 hunks)
server/search/vespa.ts (10 hunks)
server/search/vespaClient.ts (5 hunks)
server/server.ts (5 hunks)
server/shared/types.ts (5 hunks)
server/types.ts (2 hunks)
server/vespa/schemas/chat_message.sd (1 hunks)
server/vespa/services.xml (1 hunks)

✅ Files skipped from review due to trivial changes (4)

server/migrations/0004_perpetual_shard.sql
server/integrations/abortManager.ts
server/migrations/meta/0005_snapshot.json
server/migrations/meta/0004_snapshot.json

🚧 Files skipped from review as they are similar to previous changes (7)

server/vespa/services.xml
server/integrations/metricStream.ts
frontend/src/types.ts
server/package.json
server/queue/index.ts
server/db/connector.ts
server/integrations/google/index.ts

🧰 Additional context used

🧬 Code Definitions (10)

server/types.ts (2)

server/shared/types.ts (1)

Apps (25-25)

server/db/oauthProvider.ts (1)

createOAuthProvider (14-34)

server/shared/types.ts (1)

server/search/types.ts (3)

chatUserSchema (18-18)

entitySchema (137-145)

ChatMessageResponseSchema (862-885)

server/server.ts (2)

server/types.ts (3)

addApiKeyConnectorSchema (76-79)

updateConnectorStatusSchema (94-97)

deleteConnectorSchema (90-92)

server/api/admin.ts (3)

AddApiKeyConnector (258-329)

UpdateConnectorStatus (331-352)

DeleteConnector (354-369)

server/api/admin.ts (6)

server/logger/index.ts (1)

getLogger (25-44)

server/db/user.ts (1)

getUserByEmail (98-107)

server/shared/types.ts (1)

Apps (25-25)

server/errors/index.ts (1)

NoUserFound (322-327)

server/db/connector.ts (2)

updateConnector (245-263)

deleteConnector (265-274)

server/queue/index.ts (1)

SaaSQueue (31-31)

server/search/vespa.ts (3)

server/search/vespaClient.ts (1)

insert (172-212)

server/search/types.ts (4)

Inserts (647-657)

VespaSchema (24-34)

chatUserSchema (18-18)

chatMessageSchema (17-17)

server/utils.ts (1)

getErrorMessage (69-72)

server/db/schema.ts (3)

server/utils/encryption.ts (1)

Encryption (4-61)

server/db/customType.ts (1)

encryptedText (8-36)

server/shared/types.ts (1)

Apps (25-25)

frontend/src/routes/_authenticated/admin/integrations/slack.tsx (3)

server/db/connector.ts (2)

deleteConnector (265-274)

getConnectors (83-101)

server/shared/types.ts (3)

Apps (25-25)

PublicUser (39-39)

PublicWorkspace (41-41)

server/utils.ts (1)

getErrorMessage (69-72)

server/api/oauth.ts (10)

server/types.ts (2)

OAuthCredentials (129-135)

SaaSOAuthJob (125-125)

server/db/user.ts (1)

getUserByEmail (98-107)

server/errors/index.ts (1)

NoUserFound (322-327)

server/db/oauthProvider.ts (1)

getOAuthProvider (36-51)

server/shared/types.ts (1)

Apps (25-25)

server/utils.ts (1)

IsGoogleApp (200-207)

server/db/connector.ts (1)

updateConnector (245-263)

server/integrations/abortManager.ts (1)

globalAbortControllers (4-4)

server/integrations/slack/index.ts (1)

handleSlackIngestion (631-881)

server/queue/index.ts (1)

boss (23-26)

server/search/vespaClient.ts (2)

server/search/types.ts (2)

Inserts (647-657)

VespaSearchResponse (626-626)

server/utils.ts (1)

getErrorMessage (69-72)

server/integrations/tracker.ts (2)

server/shared/types.ts (1)

Apps (25-25)

server/utils.ts (1)

IsGoogleApp (200-207)

🔇 Additional comments (49)

server/integrations/tracker.ts (1)

57-61:
❓ Verification inconclusive

Consider concurrency implications for the global tracker object.
If multiple threads or processes might write to serviceAccountTracker concurrently (e.g., in a server environment), the updates to completedUsers and userStats could become inconsistent. Ensure adequate synchronization or segregation of this object per request/session if concurrency is expected.

Would you like a script to scan for references to serviceAccountTracker in the codebase to confirm whether concurrency handling is addressed elsewhere?

🏁 Script executed:
#!/bin/bash
# This script searches for references to 'serviceAccountTracker' outside tracker.ts to see how it's being used.
rg -A 5 'serviceAccountTracker'
Length of output: 679
Attention: Validate Concurrency Handling for serviceAccountTracker

After reviewing the codebase, the serviceAccountTracker is defined in server/integrations/tracker.ts (lines 57‑61) and the only other mention found—in server/integrations/google/index.ts—is in commented-out code. This indicates that, at present, there’s no active concurrent access pattern observed.

Action Required: Please verify whether concurrent modifications are anticipated in future usage (e.g., through uncommented code or in a multi-threaded server scenario). If so, consider adding appropriate synchronization (or using a per-request/session instance) to prevent potential race conditions.
server/vespa/schemas/chat_message.sd (4)

1-2: Solid schema naming.
The top-level schema name chat_message is concise and descriptive, aligning with the domain context. No issues here.

127-132:
❓ Verification inconclusive

Confirm usage of query parameters in rank profiles.
You have defined query(e) and query(alpha), but they are not used within the rank profile expressions. Confirm whether these are placeholders for future expansions or remove them for clarity.

🏁 Script executed:
#!/bin/bash
# Searching for references to query(e) and query(alpha) usage in rank expressions or code
rg -A 5 'query\$e\$' 
rg -A 5 'query\$alpha\$'
Length of output: 53
Action: Confirm Intentionality for Unused Query Parameters in Rank Profile

Based on an initial search across the codebase, there are no references to query(e) or query(alpha) being used in any rank expressions or elsewhere. Please verify whether these parameters are intended as placeholders for future functionality or if they should be removed for clarity.

The search using rg returned no additional usage of these query parameters in the repository.

Kindly re-check their intended role within the rank profile to ensure they’re not accidentally omitted from expressions.

153-160:
❓ Verification inconclusive

Review weighting in global-phase expression.
The factor * 5 on nativeRank(text, username, name) can disproportionately skew results toward textual fields. Ensure that this weighting aligns with your ranking strategy.

🏁 Script executed:
#!/bin/bash
# Searching project to see if there's any explicit mention or doc on global-phase weighting
rg 'nativeRank\(' -A 3
Length of output: 262
Please verify the intent behind the global-phase multiplier.

In file server/vespa/schemas/chat_message.sd (lines 153–160), the expression
global-phase {
  expression {
    (
      nativeRank(text, username, name) * 5
    )
  }
  rerank-count: 1000
}
multiplies the nativeRank(text, username, name) result by 5. This factor may disproportionately emphasize text-based fields, so please double-check that this weighting aligns with your overall ranking strategy. If the multiplier is intentional to boost text relevance, consider adding a comment for clarity. Otherwise, adjust the multiplier as needed.

116-121:
✅ Verification successful

Verify bfloat16 support for text embeddings.
Ensure Vespa indexing supports bfloat16 for your use case. Some versions and setups may only support float16 or float32.

🌐 Web query:
Is bfloat16 indexing field supported in Vespa for text embeddings?
💡 Result:

Yes, Vespa supports using bfloat16 for indexing text embedding fields. This implementation offers significant memory and storage savings while maintaining retrieval quality:

Key Implementation Details
50% memory reduction: bfloat16 uses 2 bytes per dimension vs. 4 bytes for float
Schema configuration: Embedding fields can be declared using:
field embedding type tensor<bfloat16>(x[384]) {
    indexing: input text | embed | attribute
    attribute {
        distance-metric: prenormalized-angular
    }
}
Minimal accuracy loss: Evaluations show nearly identical retrieval performance between float and bfloat16[1][4]:

Model NDCG@10 (float) NDCG@10 (bfloat16)

bge-small-en 0.7395 0.7346

bge-base-en 0.7662 0.7656
Operational Characteristics

Automatic casting: Vespa converts bfloat16 to float at query time for computation, with a 20-30% performance overhead compared to native float usage[1][4]

Storage optimization: Enables storing 2x more embeddings per node compared to float[1]

Wide adoption: Used in production applications for multi-modal embeddings (text+image) and ColBERT implementations[6][7]

The bfloat16 support is particularly valuable for large-scale deployments where memory efficiency is crucial, as it maintains 99%+ of retrieval effectiveness while halving storage requirements[1][4].

Citations:

1: https://blog.vespa.ai/bge-embedding-models-in-vespa-using-bfloat16/

2: https://huggingface.co/spaces/vespa-engine/colpali-vespa-visual-retrieval/blame/51cbf12cbaafee2dea457e8aa46918be15c0b554/prepare_feed_deploy.py

3: https://www.restack.io/p/vespa-vector-answer-cat-ai

4: https://docs.vespa.ai/en/embedding.html

5: https://www.restack.io/p/embeddings-knowledge-answer-embedding-techniques-json-cat-ai

6: https://pyvespa.readthedocs.io/en/latest/examples/chat_with_your_pdfs_using_colbert_langchain_and_Vespa-cloud.html

7: https://pyvespa.readthedocs.io/en/latest/examples/colpali-document-retrieval-vision-language-models-cloud.html

8: https://github.com/vespa-engine/pyvespa/blob/master/docs/sphinx/source/examples/mother-of-all-embedding-models-cloud.ipynb

9: Ability to index (hnsw) multiple vector points per document vespa-engine/vespa#15854

10: https://colab.research.google.com/github/vespa-engine/pyvespa/blob/master/docs/sphinx/source/examples/scaling-personal-ai-assistants-with-streaming-mode-cloud.ipynb
Bfloat16 for Text Embeddings Confirmed

The Vespa documentation confirms that using tensor<bfloat16> for text embedding fields is supported. The schema configuration in your file is valid, and the current use of the angular distance metric is acceptable. If you want to explore potential optimizations, you might also consider evaluating prenormalized-angular as seen in some examples, but it isn’t a mandatory change.
server/search/vespa.ts (4)

12-13: Add chat schemas
The new imports for chatUserSchema and chatMessageSchema look consistent with adding chat-based data sources.

28-30: Additional chat container & union type
Adding VespaChatContainer, Inserts, and VespaSearchResults broadens the union and search result types, which aligns with the new chat-based usage. Ensure that these are integrated consistently across the codebase.

89-89: Generic insert method
Replacing the union parameter with the Inserts type is a neat approach that unifies document handling. This change simplifies the insertion logic.

143-144: Add chat schemas to AllSources
Including both chatUserSchema and chatMessageSchema within AllSources is consistent with the new chat functionalities, ensuring these schemas are discoverable in queries.

server/search/vespaClient.ts (1)

15-16: New imports for chat
Adding VespaChatContainer and Inserts enables handling chat-related inserts within the Vespa client. This aligns well with the union type in server/search/vespa.ts.

server/shared/types.ts (4)

15-16: Import and export additions look good.

The new imports for chatUserSchema and ChatMessageResponseSchema along with the export of SlackEntity properly support the Slack integration functionality.

Also applies to: 24-24

62-62: ConnectorStatus enum updated with Paused state.

The addition of the Paused status to the ConnectorStatus enum is a good enhancement that allows for better lifecycle management of connectors. This change aligns with the migration file that adds this value to the database enum.

151-162: AutocompleteChatUserSchema is well-structured.

The schema follows the pattern of other autocomplete schemas while properly addressing chat user specifics. Note that both name and email are optional here, whereas email is required in AutocompleteUserSchema. This seems intentional for Slack integration where users might not always have emails available.

171-171: Good integration of chat user autocomplete into the system.

The new chat user schema is correctly integrated into the existing type system:

Adding to the discriminated union in AutocompleteSchema

Creating a proper type definition for ChatUserAutocomplete

Including ChatMessageResponseSchema in the search results

This enables a cohesive chat user experience in the autocomplete and search functionality.

Also applies to: 186-186, 255-255

server/db/schema.ts (6)

40-40: New encryption instance for API keys.

Creating a dedicated apiKeyEncryption instance is a good security practice, ensuring proper encryption of sensitive API key data.

146-146: API key field added to connectors table.

The apiKey field is properly encrypted using the dedicated encryption instance, ensuring secure storage of API keys in the database.

154-157: Well-documented state field for connectors table.

The addition of the state field with clear comments explaining its purpose is excellent. This field will be useful for storing the state needed to resume/restart integrations after crashes, improving the reliability of the system.

182-182: Field name corrected for consistency.

Renamed from container_id to connector_id for better naming consistency. This improves code clarity and maintainability.

367-389: Well-structured ingestion state schemas.

The new schemas for ingestion states are well-defined:

slackOAuthIngestionStateSchema contains fields specific to Slack integration

googleDriveOAuthIngestionStateSchema is prepared for Google Drive integration

ingestionStateSchema creates a proper discriminated union for type safety

This provides a solid foundation for managing integration state across different providers.

394-394: Type definitions for new schema.

The updated selectConnectorSchema correctly incorporates the new state field, and the new type definitions for ingestion states are properly inferred from their schemas, ensuring type safety.

Also applies to: 397-398

server/migrations/0005_empty_gertrude_yorkes.sql (1)

1-1: SQL migration for the new Paused status.

This migration correctly adds the 'paused' value to the "status" type, which aligns with the Paused status added to the ConnectorStatus enum in the TypeScript code. This ensures database and code consistency.

server/types.ts (2)

76-82: Well-defined API key connector schema.

The new addApiKeyConnectorSchema is well-structured with the required fields for creating an API key connector, and the corresponding type is properly defined.

90-97: New schemas for connector management.

The deleteConnectorSchema and updateConnectorStatusSchema provide well-defined structures for managing connectors, including their status. These will help ensure proper validation of inputs for these operations.

server/migrations/meta/_journal.json (1)

18-45: LGTM: Migration entries added in sequence.

The new migration entries follow the correct sequential pattern with incrementing indices and timestamps, which is important for database schema version tracking.

server/api/oauth.ts (3)

69-97: LGTM: Slack OAuth implementation added.

The Slack OAuth token retrieval implementation follows best practices by using proper content types and URL parameters. The response parsing extracts necessary token data into a structured format.

119-129: LGTM: Slack-specific ingestion handling.

The implementation properly sets up an abort controller for the Slack ingestion operation, which is a good practice for managing long-running operations.

145-147: LGTM: Slack-specific redirect URL.

The redirect URL is correctly set based on the app type, ensuring users are directed to the appropriate success page after authentication.

server/api/admin.ts (2)

76-85: LGTM: Slack OAuth URL construction.

The implementation correctly sets up the OAuth URL for Slack with the necessary parameters including client ID, redirect URI, state, and scopes.

331-352: LGTM: UpdateConnectorStatus implementation.

The implementation correctly validates the connector existence before updating its status and returns a meaningful success response.

server/integrations/ingestionState.ts (2)

6-29: LGTM: IngestionState class implementation with mutex locking.

The IngestionState class is well-designed with a generic type parameter and mutex locking to prevent concurrent modifications.

35-47: LGTM: Thread-safe state update method.

The update method correctly uses mutex locking to ensure thread safety when modifying the state.

server/server.ts (6)

17-25: Imports have been properly organized for new schemas.

The import statements have been updated to include new schemas for API key connectors, connector deletion, and status updates, which aligns well with the implementation in other files.

27-34: API functions correctly imported for the new connector operations.

The import statements properly include the new functions AddApiKeyConnector, DeleteConnector, and UpdateConnectorStatus that will be used by the new API endpoints.

50-51: OAuth imports properly restructured.

Updated imports for OAuth callback and cookie utilities, which supports the new integration functionality.

66-66: WebSocket import path has been refactored.

The import for wsConnections has been changed from a Google-specific path to a more generic metricStream path, which is a good architectural improvement that supports multiple integrations including Slack.

187-191: New API endpoint for creating API key connectors.

This endpoint correctly uses the zValidator middleware to validate requests against the defined schema and calls the appropriate handler function.

193-202: New endpoints for connector management.

Added endpoints for updating connector status and deleting connectors, both using appropriate validation schemas and handler functions.

server/integrations/slack/index.ts (4)

1-41: Clean import organization and logging setup.

The imports are well organized by category (DB, search types, Slack API), and a dedicated logger instance is created for the Slack integration module, which is good practice.

51-65: Well-implemented user fetching with pagination.

The getAllUsers function correctly handles pagination using cursors and includes proper error handling for failed API requests.

139-156: Good implementation of retry mechanism for API calls.

The safeConversationReplies function uses the retryOnFatal helper to handle potential API failures, which is essential for resilience when dealing with external APIs that might have rate limits.

183-197: Well-designed mention replacement function.

The replaceMentionsIfPresent function efficiently handles user mentions in messages by checking if mentions exist before attempting replacements and properly resets the regex state, which avoids potential issues with regex stateful behavior.

frontend/src/routes/_authenticated/admin/integrations/slack.tsx (8)

40-54: Well-implemented connector status update function.

The function properly handles errors, including unauthorized access, and provides appropriate user feedback through toast notifications.

57-70: Well-implemented connector deletion function.

Similar to the status update function, this properly handles errors and provides appropriate user feedback.

72-93: Good implementation of botToken submission.

The function correctly submits the token to the backend, handles errors appropriately, and redirects to the auth page if unauthorized.

107-131: Well-implemented OAuth button component.

This component properly handles the OAuth flow and provides appropriate error feedback.

169-272: Comprehensive form validation in SlackOAuthForm.

The form properly validates inputs and handles submission with appropriate error handling.

490-505: Good use of error handling in API query.

The query function properly handles unauthorized errors by redirecting to the auth page, which is a good security practice.

510-527: Well-structured status management logic.

The effect hook properly updates the integration status based on the connector state, covering all possible status values.

533-579: Well-implemented WebSocket connection.

The WebSocket connection is properly set up and managed, with appropriate event listeners for open, message, close, and error events, and proper cleanup in the effect's return function.

coderabbitai · 2025-03-28T03:04:55Z

server/integrations/tracker.ts

+  updateUserStats(email: string, type: StatType, count: number) {
+    this.initializeUserStats(email)
+
+    const serviceStats = this.serviceAccountProgress.userStats[email]
+    const oAuthStats = this.oAuthProgress.userStats[email]
+
+    // Update only if the stat type exists in the stats object
+    if (type in serviceStats) {
+      ;(serviceStats as any)[type] += count
+    }
+    if (type in oAuthStats) {
+      ;(oAuthStats as any)[type] += count
+    }
+  }


🛠️ Refactor suggestion

Refine the typing to remove "any" casting.
Using (serviceStats as any)[type] and (oAuthStats as any)[type] can lead to runtime type mismatches. Consider a typed approach (e.g., using type-mapped fields for known stat properties).

coderabbitai · 2025-03-28T03:04:56Z

server/integrations/tracker.ts

+  getProgress(): number {
+    if (IsGoogleApp(this.app)) {
+      if(this.authType === AuthType.ServiceAccount) {
+      return Math.floor(
+        (this.serviceAccountProgress.completedUsers /
+          this.serviceAccountProgress.totalUsers) *
+          100,
+      )
+      } else {
+        return 0
+      }
+    } else if (this.app === Apps.Slack) {
+      return Math.floor(this.oAuthProgress.current/this.oAuthProgress.total)
+      // return Math.floor(this.oAuthProgress.userStats[this.oAuthProgress.user].slackConversationCount/)
+    } else {
+      throw new Error("Invalid app for progress")
+    }
+  }


⚠️ Potential issue

Handle potential division by zero in Slack progress calculation.
At line 187, this.oAuthProgress.total could be 0, causing a runtime error. Include a guard condition or default value to avoid division by zero.

Proposed fix:

- return Math.floor(this.oAuthProgress.current/this.oAuthProgress.total) + return this.oAuthProgress.total > 0 + ? Math.floor(this.oAuthProgress.current / this.oAuthProgress.total) + : 0

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

getProgress(): number {

if (IsGoogleApp(this.app)) {

if(this.authType === AuthType.ServiceAccount) {

return Math.floor(

(this.serviceAccountProgress.completedUsers /

this.serviceAccountProgress.totalUsers) *

100,

)

} else {

return 0

}

} else if (this.app === Apps.Slack) {

return Math.floor(this.oAuthProgress.current/this.oAuthProgress.total)

// return Math.floor(this.oAuthProgress.userStats[this.oAuthProgress.user].slackConversationCount/)

} else {

throw new Error("Invalid app for progress")

}

}

getProgress(): number {

if (IsGoogleApp(this.app)) {

if(this.authType === AuthType.ServiceAccount) {

return Math.floor(

(this.serviceAccountProgress.completedUsers /

this.serviceAccountProgress.totalUsers) *

100,

)

} else {

return 0

}

} else if (this.app === Apps.Slack) {

return this.oAuthProgress.total > 0

? Math.floor(this.oAuthProgress.current / this.oAuthProgress.total)

: 0

// return Math.floor(this.oAuthProgress.userStats[this.oAuthProgress.user].slackConversationCount/)

} else {

throw new Error("Invalid app for progress")

}

}

coderabbitai · 2025-03-28T03:04:56Z

server/search/vespa.ts

+        or
+        (name_fuzzy contains ({maxEditDistance: 2, prefix: true} fuzzy(@query)) or email_fuzzy contains ({maxEditDistance: 2, prefix: true} fuzzy(@query))
+        and permissions contains @email
+        )


🛠️ Refactor suggestion

Verify operator precedence
When using or and and together in your YQL condition, confirm the intended logic is preserved. Consider additional parentheses around (name_fuzzy ... or email_fuzzy ...) before the and permissions contains @email clause for clarity.

coderabbitai · 2025-03-28T03:04:56Z

server/api/oauth.ts

-    )
-    const oauthTokens = tokens as OAuthCredentials
-    oauthTokens.data.accessTokenExpiresAt = tokens.accessTokenExpiresAt()
+    console.log("provider", provider)


🛠️ Refactor suggestion

Remove console.log statement exposing sensitive provider data.

This logs potentially sensitive information about the OAuth provider including client secrets.

- console.log("provider", provider) + Logger.debug("OAuth provider details retrieved successfully")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

console.log("provider", provider)

Logger.debug("OAuth provider details retrieved successfully")

coderabbitai · 2025-03-28T03:04:56Z

server/api/admin.ts

+        singletonKey: connector.externalId,
+        priority: 1,
+        retryLimit: 0,
+      })


🛠️ Refactor suggestion

Add job expiry configuration for API key connector jobs.

The API key connector jobs are missing the expireInHours parameter that's present in other job configurations, which could lead to lingering jobs.

const jobId = await boss.send(SaaSQueue, SaasJobPayload, { singletonKey: connector.externalId, priority: 1, retryLimit: 0, + expireInHours: JobExpiryHours, })

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

singletonKey: connector.externalId,

priority: 1,

retryLimit: 0,

})

const jobId = await boss.send(SaaSQueue, SaasJobPayload, {

singletonKey: connector.externalId,

priority: 1,

retryLimit: 0,

expireInHours: JobExpiryHours,

})

coderabbitai · 2025-03-28T03:04:56Z

server/api/admin.ts

+// adding first for slack
+// slack is using bot token for the initial ingestion and sync
+// same service will be used for any api key based connector
+export const AddApiKeyConnector = async (c: Context) => {
+  Logger.info("ApiKeyConnector")
+  const { sub, workspaceId } = c.get(JwtPayloadKey)
+  const email = sub
+  const userRes = await getUserByEmail(db, email)
+  if (!userRes || !userRes.length) {
+    throw new NoUserFound({})
+  }
+  const [user] = userRes
+  // @ts-ignore
+  const form: ApiKeyConnector = c.req.valid("form")
+  // const data = await form["service-key"].text()
+  const apiKey = form.apiKey
+  const app = form.app
+
+  // Start a transaction
+  return await db.transaction(async (trx) => {
+    try {
+      // Insert the connection within the transaction
+      const connector = await insertConnector(
+        trx, // Pass the transaction object
+        user.workspaceId,
+        user.id,
+        user.workspaceExternalId,
+        `${app}-${ConnectorType.SaaS}-${AuthType.ApiKey}`,
+        ConnectorType.SaaS,
+        AuthType.ApiKey,
+        app,
+        {},
+        null,
+        null,
+        null,
+        apiKey,
+      )
+
+      const SaasJobPayload: SaaSJob = {
+        connectorId: connector.id,
+        workspaceId: user.workspaceId,
+        userId: user.id,
+        app,
+        externalId: connector.externalId,
+        authType: connector.authType as AuthType,
+        email: sub,
+      }
+      // Enqueue the background job within the same transaction
+      const jobId = await boss.send(SaaSQueue, SaasJobPayload, {
+        singletonKey: connector.externalId,
+        priority: 1,
+        retryLimit: 0,
+      })
+
+      Logger.info(`Job ${jobId} enqueued for connection ${connector.id}`)
+
+      // Commit the transaction if everything is successful
+      return c.json({
+        success: true,
+        message: "Connection created, job enqueued",
+        id: connector.externalId,
+      })
+    } catch (error) {
+      const errMessage = getErrorMessage(error)
+      Logger.error(
+        error,
+        `${new AddServiceConnectionError({ cause: error as Error })} \n : ${errMessage} : ${(error as Error).stack}`,
+      )
+      // Rollback the transaction in case of any error
+      throw new HTTPException(500, {
+        message: "Error creating connection or enqueuing job",
+      })
+    }
+  })
+}


💡 Verification agent

🧩 Analysis chain

Enhance security in the API key connector implementation.

The current implementation lacks proper validation and secure handling of the API key.

Add validation for the API key format

Ensure the API key is securely stored (potentially encrypted)

Add rate limiting for API key submissions

🏁 Script executed:

#!/bin/bash # Check for any API key handling best practices in the codebase # Look for encryption/secure storage patterns rg -A 3 "encrypt|secureStore|vault" --type ts # Check for validation patterns rg -A 3 "validateApiKey|isValidKey" --type ts # Look for any other API key handling in the codebase rg -A 3 "apiKey(?!.*=)" --type ts

Length of output: 5111

Enhance API key security in the connector API

After further review, note that while encryption is applied at the database layer (evident in server/db/schema.ts via the encryptedText wrapper using the Encryption utility), the API endpoint itself lacks direct API key format validation and rate limiting. Please address the following:

API key validation: Add a validation check to ensure the API key meets the expected format before processing.

Rate limiting: Implement rate limiting for API key submissions to guard against abuse.

Security review: Confirm that the API key is not only encrypted during storage but also securely handled earlier in the API call flow if possible.

coderabbitai · 2025-03-28T03:04:56Z

server/integrations/slack/index.ts

+function isFatalError(error: any): boolean {
+  // For example, if your error has a code or property indicating it’s fatal,
+  // you can check for that. Here, we assume every error is fatal.
+  // Modify this as needed.
+  return true
+}


🛠️ Refactor suggestion

Improve isFatalError implementation.

The current implementation treats all errors as fatal, which might not be ideal. Consider implementing more granular error classification based on Slack API error codes.

function isFatalError(error: any): boolean { - // For example, if your error has a code or property indicating it's fatal, - // you can check for that. Here, we assume every error is fatal. - // Modify this as needed. - return true + // Non-fatal errors (temporary issues like rate limits) + const nonFatalErrorCodes = ['ratelimited', 'service_unavailable', 'timeout']; + + if (error?.response?.data?.error && nonFatalErrorCodes.includes(error.response.data.error)) { + return false; + } + + // Consider network errors as potentially recoverable + if (error.code === 'ECONNRESET' || error.code === 'ETIMEDOUT') { + return false; + } + + // All other errors are considered fatal + return true }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

function isFatalError(error: any): boolean {

// For example, if your error has a code or property indicating it’s fatal,

// you can check for that. Here, we assume every error is fatal.

// Modify this as needed.

return true

}

function isFatalError(error: any): boolean {

// Non-fatal errors (temporary issues like rate limits)

const nonFatalErrorCodes = ['ratelimited', 'service_unavailable', 'timeout'];

if (error?.response?.data?.error && nonFatalErrorCodes.includes(error.response.data.error)) {

return false;

}

// Consider network errors as potentially recoverable

if (error.code === 'ECONNRESET' || error.code === 'ETIMEDOUT') {

return false;

}

// All other errors are considered fatal

return true

}

zereraz added 2 commits March 22, 2025 19:01

initial slack commit

076e224

Fix the app filters for search

8fc6cb2

gemini-code-assist bot reviewed Mar 23, 2025

View reviewed changes

gemini-code-assist bot suggested changes Mar 23, 2025

View reviewed changes

remove merge conflict

2e11048

Merge branch 'main' into integrations/slack

926202e

coderabbitai bot reviewed Mar 24, 2025

View reviewed changes

zereraz changed the title ~~feat(integrations: Slack~~ feat(integrations): Slack Mar 24, 2025

wip: Allow slack integration to resume and start again

2027075

coderabbitai bot reviewed Mar 28, 2025

View reviewed changes

zereraz mentioned this pull request Mar 28, 2025

remove pg boss for ingestion job and add whitelist emails support for service account #359

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(integrations): Slack #353

feat(integrations): Slack #353

zereraz commented Mar 23, 2025 •

edited by coderabbitai bot

Loading

gemini-code-assist bot left a comment

gemini-code-assist bot left a comment

coderabbitai bot commented Mar 24, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

Key Observations

Flow Comparison

Conclusion

coderabbitai bot left a comment

coderabbitai bot Mar 24, 2025

coderabbitai bot left a comment

Key Implementation Details

Operational Characteristics

coderabbitai bot Mar 28, 2025

coderabbitai bot Mar 28, 2025

coderabbitai bot Mar 28, 2025

coderabbitai bot Mar 28, 2025

coderabbitai bot Mar 28, 2025

coderabbitai bot Mar 28, 2025

coderabbitai bot Mar 28, 2025

Parameter	Slack OAuth	PKCE OAuth
Initial Auth Request	`client_id`, `scope`, `redirect_uri`, `state`[2][4]	`client_id`, `code_challenge`, `code_challenge_method`[1][6][10]
Token Exchange	`client_secret`, `code`[4][8]	`code_verifier` (no `client_secret`)[1][6][10]
Use Case	Web/backend apps with a `client_secret`[4]	Mobile/SPA apps without `client_secret`[1][10]

	console.log("provider", provider)
	Logger.debug("OAuth provider details retrieved successfully")

-function isFatalError(error: any): boolean {
-  // For example, if your error has a code or property indicating it’s fatal,
-  // you can check for that. Here, we assume every error is fatal.
-  // Modify this as needed.
-  return true
-}
+function isFatalError(error: any): boolean {
+  // Non-fatal errors (temporary issues like rate limits)
+  const nonFatalErrorCodes = ['ratelimited', 'service_unavailable', 'timeout'];
+  if (error?.response?.data?.error && nonFatalErrorCodes.includes(error.response.data.error)) {
+    return false;
+  }
+  // Consider network errors as potentially recoverable
+  if (error.code === 'ECONNRESET' || error.code === 'ETIMEDOUT') {
+    return false;
+  }
+  // All other errors are considered fatal
+  return true
+}

feat(integrations): Slack #353

Are you sure you want to change the base?

feat(integrations): Slack #353

Conversation

zereraz commented Mar 23, 2025 • edited by coderabbitai bot Loading

Description

Testing

Additional Notes

Summary by CodeRabbit

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Summary of Changes

Highlights

Changelog

Footnotes

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Summary of Findings

Merge Readiness

coderabbitai bot commented Mar 24, 2025 • edited Loading

Walkthrough

Changes

Sequence Diagram(s)

Slack OAuth Callback Flow

API Key Connector Creation Flow

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

Key Observations

Flow Comparison

Conclusion

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Mar 24, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

Key Implementation Details

Operational Characteristics

coderabbitai bot Mar 28, 2025

Choose a reason for hiding this comment

coderabbitai bot Mar 28, 2025

Choose a reason for hiding this comment

coderabbitai bot Mar 28, 2025

Choose a reason for hiding this comment

coderabbitai bot Mar 28, 2025

Choose a reason for hiding this comment

coderabbitai bot Mar 28, 2025

Choose a reason for hiding this comment

coderabbitai bot Mar 28, 2025

Choose a reason for hiding this comment

coderabbitai bot Mar 28, 2025

Choose a reason for hiding this comment

zereraz commented Mar 23, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 24, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)