CarbonAI 0.3.53

Maintained by Konfig Publisher.



By carbon.ai

Visit Carbon

Connect external data to LLMs, no matter the source.

Table of Contents

Installation

Swift Package Manager

  1. In Xcode, select File > Add Packages… and enter https://github.com/Carbon-for-Developers/carbon-swift-sdk as the repository URL.
  2. Select the latest version number from our tags page.
  3. Add the CarbonAI product to the target of your app.

Carthage

  1. Add this line to your Cartfile:
github "Carbon-for-Developers/carbon-swift-sdk"
  2. Follow the Carthage installation instructions.
  3. In the future, to update to the latest version of the SDK, run: carthage update carbon-swift-sdk

CocoaPods

  1. Add source 'https://github.com/CocoaPods/Specs.git' to your Podfile
  2. Add pod 'CarbonAI', '~> 0.2.2' to your Podfile

Your Podfile should look like:

# Podfile
source 'https://github.com/CocoaPods/Specs.git'

target 'Example' do
  pod 'CarbonAI', '~> 0.2.2'
end
  3. Run pod install
❯ pod install
Analyzing dependencies
Downloading dependencies
Installing CarbonAI 0.2.2
Generating Pods project
Integrating client project
Pod installation complete! There is 1 dependency from the Podfile and 2 total pods installed.
  4. In the future, to update to the latest version of the SDK, run: pod update CarbonAI

Getting Started

import CarbonAI

// 1) Get an access token for a customer
let carbon = CarbonAIClient(
    accessToken: nil,
    apiKey: "API_KEY",
    customerId: "CUSTOMER_ID"
)

let token = try await carbon.auth.getAccessToken()

// 2) Use the access token to authenticate moving forward
let carbonWithToken = CarbonAIClient(
    accessToken: token!.accessToken,
    apiKey: nil,
    customerId: nil
)

// Use the SDK as usual
let whiteLabeling = try await carbonWithToken.auth.getWhiteLabeling()
// etc.

Reference

carbonai.auth.getAccessToken

Get Access Token

🛠️ Usage

let getAccessTokenResponse = try await carbonai.auth.getAccessToken()

🔄 Return

TokenResponse

🌐 Endpoint

/auth/v1/access_token GET

🔙 Back to Table of Contents


carbonai.auth.getWhiteLabeling

Returns whether or not the organization is white labeled and which integrations are white labeled.

🛠️ Usage

let getWhiteLabelingResponse = try await carbonai.auth.getWhiteLabeling()

🔄 Return

WhiteLabelingResponse

🌐 Endpoint

/auth/v1/white_labeling GET

🔙 Back to Table of Contents


carbonai.dataSources.queryUserDataSources

User Data Sources

🛠️ Usage

let pagination = Pagination(
    limit: 123,
    offset: 123
)
let orderBy = OrganizationUserDataSourceOrderByColumns(
    
)
let orderDir = OrderDir(
    
)
let filters = OrganizationUserDataSourceFilters(
    source: DataSourceTypeNullable.googleDrive,
    ids: [
    123
    ],
    revokedAccess: false
)
let queryUserDataSourcesResponse = try await carbonai.dataSources.queryUserDataSources(
    pagination: pagination,
    orderBy: orderBy,
    orderDir: orderDir,
    filters: filters
)

⚙️ Parameters

pagination: Pagination
order_by: OrganizationUserDataSourceOrderByColumns
order_dir: OrderDir
filters: OrganizationUserDataSourceFilters

🔄 Return

OrganizationUserDataSourceResponse

🌐 Endpoint

/user_data_sources POST

🔙 Back to Table of Contents


carbonai.dataSources.revokeAccessToken

Revoke Access Token

🛠️ Usage

let dataSourceId = 987
let revokeAccessTokenResponse = try await carbonai.dataSources.revokeAccessToken(
    dataSourceId: dataSourceId
)

⚙️ Parameters

data_source_id: Int

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/revoke_access_token POST

🔙 Back to Table of Contents


carbonai.embeddings.getDocuments

For pre-filtering documents, using tags_v2 is preferred to using tags (which is now deprecated). If both tags_v2 and tags are specified, tags is ignored. tags_v2 enables building complex filters through the use of "AND", "OR", and negation logic. Take the below input as an example:

{
    "OR": [
        {
            "key": "subject",
            "value": "holy-bible",
            "negate": false
        },
        {
            "key": "person-of-interest",
            "value": "jesus christ",
            "negate": false
        },
        {
            "key": "genre",
            "value": "religion",
            "negate": true
        },
        {
            "AND": [
                {
                    "key": "subject",
                    "value": "tao-te-ching",
                    "negate": false
                },
                {
                    "key": "author",
                    "value": "lao-tzu",
                    "negate": false
                }
            ]
        }
    ]
}

In this case, files will be filtered such that:

  1. "subject" = "holy-bible" OR
  2. "person-of-interest" = "jesus christ" OR
  3. "genre" != "religion" OR
  4. "subject" = "tao-te-ching" AND "author" = "lao-tzu"

Note that the top level of the query must be either an "OR" or "AND" array. Currently, nesting is limited to 3. For tag blocks (those with "key", "value", and "negate" keys), the following typing rules apply:

  1. "key" isn't optional and must be a string
  2. "value" isn't optional and can be any or list[any]
  3. "negate" is optional and must be true or false. If present and true, then the filter block is negated in the resulting query. It is false by default.
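
The filter above can be assembled programmatically before being passed as tags_v2. A sketch only (using a plain JSON-friendly dictionary rather than any SDK-specific type): key/value/negate blocks nest under "OR"/"AND" arrays, with the top level being one of the two.

```swift
import Foundation

// Build the tags_v2 filter as a JSON-friendly dictionary, then encode it.
let tagsV2: [String: Any] = [
    "OR": [
        ["key": "subject", "value": "holy-bible", "negate": false],
        ["key": "genre", "value": "religion", "negate": true],
        // Nested "AND" block: both conditions must hold for this branch.
        ["AND": [
            ["key": "subject", "value": "tao-te-ching", "negate": false],
            ["key": "author", "value": "lao-tzu", "negate": false],
        ]],
    ],
]
let body = try JSONSerialization.data(withJSONObject: tagsV2)
```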

When querying embeddings, you can optionally specify the media_type parameter in your request. By default (if not set), it is equal to "TEXT". This means that the query will be performed over files that have been parsed as text (for now, this covers all files except image files). If it is equal to "IMAGE", the query will be performed over image files (for now, .jpg and .png files). You can think of this field as an additional filter on top of any filters set in file_ids and parent_file_ids.

When hybrid_search is set to true, a combination of keyword search and semantic search are used to rank and select candidate embeddings during information retrieval. By default, these search methods are weighted equally during the ranking process. To adjust the weight (or "importance") of each search method, you can use the hybrid_search_tuning_parameters property. The description for the different tuning parameters are:

  • weight_a: weight to assign to semantic search
  • weight_b: weight to assign to keyword search

You must ensure that sum(weight_a, weight_b,..., weight_n) for all n weights is equal to 1. The equality has an error tolerance of 0.001 to account for possible floating point issues.
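
Since this constraint is easy to violate when tuning, a small client-side check can catch it before the request is sent. This helper is hypothetical and not part of the SDK:

```swift
// Hybrid search weights must sum to 1, within a 0.001 error tolerance
// to absorb floating point drift.
func weightsSumToOne(_ weights: [Double], tolerance: Double = 0.001) -> Bool {
    abs(weights.reduce(0, +) - 1.0) <= tolerance
}
```

For example, `weightsSumToOne([0.7, 0.3])` passes, while `weightsSumToOne([0.7, 0.4])` does not.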

In order to use hybrid search for a customer across a set of documents, two flags need to be enabled:

  1. Use the /modify_user_configuration endpoint to enable sparse_vectors for the customer. The payload body for this request is below:
{
  "configuration_key_name": "sparse_vectors",
  "value": {
    "enabled": true
  }
}
  2. Make sure hybrid search is enabled for the documents across which you want to perform the search. For the /uploadfile endpoint, this can be done by setting the following query parameter: generate_sparse_vectors=true

Carbon supports multiple models for use in generating embeddings for files. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's text-embedding-ada-002 and Cohere's embed-multilingual-v3.0. The model can be specified via the embedding_model parameter (in the POST body for /embeddings, and a query parameter in /uploadfile). If no model is supplied, the text-embedding-ada-002 is used by default. When performing embedding queries, embeddings from files that used the specified model will be considered in the query. For example, if files A and B have embeddings generated with OPENAI, and files C and D have embeddings generated with COHERE_MULTILINGUAL_V3, then by default, queries will only consider files A and B. If COHERE_MULTILINGUAL_V3 is specified as the embedding_model in /embeddings, then only files C and D will be considered. Make sure that the set of all files you want considered for a query have embeddings generated via the same model. For now, do not set VERTEX_MULTIMODAL as an embedding_model. This model is used automatically by Carbon when it detects an image file.

🛠️ Usage

let query = "query_example"
let k = 987
let tags = "TODO"
let queryVector = [
123
]
let fileIds = [
123
]
let parentFileIds = [
123
]
let tagsV2 = "TODO"
let includeTags = true
let includeVectors = true
let includeRawFile = true
let hybridSearch = true
let hybridSearchTuningParameters = HybridSearchTuningParamsNullable(
    weightA: 123,
    weightB: 123
)
let mediaType = FileContentTypesNullable(
    
)
let embeddingModel = EmbeddingGeneratorsNullable(
    
)
let getDocumentsResponse = try await carbonai.embeddings.getDocuments(
    query: query,
    k: k,
    tags: tags,
    queryVector: queryVector,
    fileIds: fileIds,
    parentFileIds: parentFileIds,
    tagsV2: tagsV2,
    includeTags: includeTags,
    includeVectors: includeVectors,
    includeRawFile: includeRawFile,
    hybridSearch: hybridSearch,
    hybridSearchTuningParameters: hybridSearchTuningParameters,
    mediaType: mediaType,
    embeddingModel: embeddingModel
)

⚙️ Parameters

query: String

Query for which to get related chunks and embeddings.

k: Int

Number of related chunks to return.

tags: [String: Tags1]

A set of tags to limit the search to. Deprecated and may be removed in the future.

query_vector: [Double]

Optional query vector for which to get related chunks and embeddings. It must have been generated by the same model used to generate the embeddings across which the search is being conducted. Cannot provide both query and query_vector.

file_ids: [Int]

Optional list of file IDs to limit the search to

parent_file_ids: [Int]

Optional list of parent file IDs to limit the search to. A parent file describes a file to which another file belongs (e.g. a folder)

tags_v2: AnyCodable

A set of tags to limit the search to. Use this instead of tags, which is deprecated.

include_tags: Bool

Flag to control whether or not to include tags for each chunk in the response.

include_vectors: Bool

Flag to control whether or not to include embedding vectors in the response.

include_raw_file: Bool

Flag to control whether or not to include a signed URL to the raw file containing each chunk in the response.

hybrid_search: Bool

Flag to control whether or not to perform hybrid search.

hybrid_search_tuning_parameters: HybridSearchTuningParamsNullable
media_type: FileContentTypesNullable
embedding_model: EmbeddingGeneratorsNullable

🔄 Return

DocumentResponseList

🌐 Endpoint

/embeddings POST

🔙 Back to Table of Contents


carbonai.embeddings.getEmbeddingsAndChunks

Retrieve Embeddings And Content

🛠️ Usage

let filters = EmbeddingsAndChunksFilters(
    userFileId: 123,
    embeddingModel: EmbeddingGeneratorsNullable.openai
)
let pagination = Pagination(
    limit: 123,
    offset: 123
)
let orderBy = EmbeddingsAndChunksOrderByColumns(
    
)
let orderDir = OrderDir(
    
)
let includeVectors = true
let getEmbeddingsAndChunksResponse = try await carbonai.embeddings.getEmbeddingsAndChunks(
    filters: filters,
    pagination: pagination,
    orderBy: orderBy,
    orderDir: orderDir,
    includeVectors: includeVectors
)

⚙️ Parameters

pagination: Pagination
order_by: EmbeddingsAndChunksOrderByColumns
order_dir: OrderDir
include_vectors: Bool

🔄 Return

EmbeddingsAndChunksResponse

🌐 Endpoint

/text_chunks POST

🔙 Back to Table of Contents


carbonai.embeddings.uploadChunksAndEmbeddings

Upload Chunks And Embeddings

🛠️ Usage

let embeddingModel = EmbeddingGenerators(
    
)
let chunksAndEmbeddings = [
SingleChunksAndEmbeddingsUploadInput(
    fileId: 123,
    chunkSize: 123,
    chunkOverlap: 123,
    chunksAndEmbeddings: [
    ChunksAndEmbeddings(
        chunkNumber: 123,
        chunk: "chunk_example",
        embedding: [
        123
        ]
    )
    ]
)
]
let overwriteExisting = true
let chunksOnly = true
let customCredentials = "TODO"
let uploadChunksAndEmbeddingsResponse = try await carbonai.embeddings.uploadChunksAndEmbeddings(
    embeddingModel: embeddingModel,
    chunksAndEmbeddings: chunksAndEmbeddings,
    overwriteExisting: overwriteExisting,
    chunksOnly: chunksOnly,
    customCredentials: customCredentials
)

⚙️ Parameters

embedding_model: EmbeddingGenerators
chunks_and_embeddings: [SingleChunksAndEmbeddingsUploadInput]
overwrite_existing: Bool
chunks_only: Bool
custom_credentials: AnyCodable

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/upload_chunks_and_embeddings POST

🔙 Back to Table of Contents


carbonai.files.createUserFileTags

A tag is a key-value pair that can be added to a file. This pair can then be used for searches (e.g. embedding searches) in order to narrow down the scope of the search. A file can have any number of tags. The following are reserved keys that cannot be used:

  • db_embedding_id
  • organization_id
  • user_id
  • organization_user_file_id

Carbon currently supports two data types for tag values - string and list<string>. Keys can only be string. If values other than string and list<string> are used, they're automatically converted to strings (e.g. 4 will become "4").
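
For illustration, the coercion rule can be sketched as follows (the conversion itself is performed by Carbon server-side; this dictionary-based version is only an assumed model of that behavior):

```swift
// Tag values may be strings or lists of strings; anything else is
// stringified, e.g. the integer 4 becomes "4".
let rawTags: [String: Any] = ["year": 4, "topics": ["ai", "search"]]
let coerced: [String: Any] = rawTags.mapValues { value -> Any in
    switch value {
    case let s as String: return s
    case let list as [String]: return list
    default: return String(describing: value)  // coerce to string
    }
}
```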

🛠️ Usage

let tags = "TODO"
let organizationUserFileId = 987
let createUserFileTagsResponse = try await carbonai.files.createUserFileTags(
    tags: tags,
    organizationUserFileId: organizationUserFileId
)

⚙️ Parameters

tags: [String: Tags1]
organization_user_file_id: Int

🔄 Return

UserFile

🌐 Endpoint

/create_user_file_tags POST

🔙 Back to Table of Contents


carbonai.files.delete

Delete File Endpoint

🛠️ Usage

let fileId = 987
let deleteResponse = try await carbonai.files.delete(
    fileId: fileId
)

⚙️ Parameters

fileId: Int

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/deletefile/{file_id} DELETE

🔙 Back to Table of Contents


carbonai.files.deleteFileTags

Delete File Tags

🛠️ Usage

let tags = [
"inner_example"
]
let organizationUserFileId = 987
let deleteFileTagsResponse = try await carbonai.files.deleteFileTags(
    tags: tags,
    organizationUserFileId: organizationUserFileId
)

⚙️ Parameters

tags: [String]
organization_user_file_id: Int

🔄 Return

UserFile

🌐 Endpoint

/delete_user_file_tags POST

🔙 Back to Table of Contents


carbonai.files.deleteMany

Delete Files Endpoint

🛠️ Usage

let fileIds = [
123
]
let syncStatuses = [
ExternalFileSyncStatuses.delayed
]
let deleteNonSyncedOnly = true
let sendWebhook = true
let deleteChildFiles = true
let deleteManyResponse = try await carbonai.files.deleteMany(
    fileIds: fileIds,
    syncStatuses: syncStatuses,
    deleteNonSyncedOnly: deleteNonSyncedOnly,
    sendWebhook: sendWebhook,
    deleteChildFiles: deleteChildFiles
)

⚙️ Parameters

file_ids: [Int]
sync_statuses: [ExternalFileSyncStatuses]
delete_non_synced_only: Bool
send_webhook: Bool
delete_child_files: Bool

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/delete_files POST

🔙 Back to Table of Contents


carbonai.files.deleteV2

Delete Files V2 Endpoint

🛠️ Usage

let filters = OrganizationUserFilesToSyncFilters(
    tags: "TODO",
    source: SourceProperty(
        
    ),
    name: "name_example",
    tagsV2: "TODO",
    ids: [
    123
    ],
    externalFileIds: [
    "externalFileIds_example"
    ],
    syncStatuses: [
    ExternalFileSyncStatuses.delayed
    ],
    parentFileIds: [
    123
    ],
    organizationUserDataSourceId: [
    123
    ],
    embeddingGenerators: [
    EmbeddingGenerators.openai
    ],
    rootFilesOnly: false,
    includeAllChildren: false,
    nonSyncedOnly: false,
    requestIds: [
    "requestIds_example"
    ]
)
let sendWebhook = true
let deleteV2Response = try await carbonai.files.deleteV2(
    filters: filters,
    sendWebhook: sendWebhook
)

⚙️ Parameters

filters: OrganizationUserFilesToSyncFilters
send_webhook: Bool

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/delete_files_v2 POST

🔙 Back to Table of Contents


carbonai.files.getParsedFile

This route is deprecated. Use /user_files_v2 instead.

🛠️ Usage

let fileId = 987
let getParsedFileResponse = try await carbonai.files.getParsedFile(
    fileId: fileId
)

⚙️ Parameters

fileId: Int

🔄 Return

PresignedURLResponse

🌐 Endpoint

/parsed_file/{file_id} GET

🔙 Back to Table of Contents


carbonai.files.getRawFile

This route is deprecated. Use /user_files_v2 instead.

🛠️ Usage

let fileId = 987
let getRawFileResponse = try await carbonai.files.getRawFile(
    fileId: fileId
)

⚙️ Parameters

fileId: Int

🔄 Return

PresignedURLResponse

🌐 Endpoint

/raw_file/{file_id} GET

🔙 Back to Table of Contents


carbonai.files.queryUserFiles

For pre-filtering documents, using tags_v2 is preferred to using tags (which is now deprecated). If both tags_v2 and tags are specified, tags is ignored. tags_v2 enables building complex filters through the use of "AND", "OR", and negation logic. Take the below input as an example:

{
    "OR": [
        {
            "key": "subject",
            "value": "holy-bible",
            "negate": false
        },
        {
            "key": "person-of-interest",
            "value": "jesus christ",
            "negate": false
        },
        {
            "key": "genre",
            "value": "religion",
            "negate": true
        },
        {
            "AND": [
                {
                    "key": "subject",
                    "value": "tao-te-ching",
                    "negate": false
                },
                {
                    "key": "author",
                    "value": "lao-tzu",
                    "negate": false
                }
            ]
        }
    ]
}

In this case, files will be filtered such that:

  1. "subject" = "holy-bible" OR
  2. "person-of-interest" = "jesus christ" OR
  3. "genre" != "religion" OR
  4. "subject" = "tao-te-ching" AND "author" = "lao-tzu"

Note that the top level of the query must be either an "OR" or "AND" array. Currently, nesting is limited to 3. For tag blocks (those with "key", "value", and "negate" keys), the following typing rules apply:

  1. "key" isn't optional and must be a string
  2. "value" isn't optional and can be any or list[any]
  3. "negate" is optional and must be true or false. If present and true, then the filter block is negated in the resulting query. It is false by default.

🛠️ Usage

let pagination = Pagination(
    limit: 123,
    offset: 123
)
let orderBy = OrganizationUserFilesToSyncOrderByTypes(
    
)
let orderDir = OrderDir(
    
)
let filters = OrganizationUserFilesToSyncFilters(
    tags: "TODO",
    source: SourceProperty(
        
    ),
    name: "name_example",
    tagsV2: "TODO",
    ids: [
    123
    ],
    externalFileIds: [
    "externalFileIds_example"
    ],
    syncStatuses: [
    ExternalFileSyncStatuses.delayed
    ],
    parentFileIds: [
    123
    ],
    organizationUserDataSourceId: [
    123
    ],
    embeddingGenerators: [
    EmbeddingGenerators.openai
    ],
    rootFilesOnly: false,
    includeAllChildren: false,
    nonSyncedOnly: false,
    requestIds: [
    "requestIds_example"
    ]
)
let includeRawFile = true
let includeParsedTextFile = true
let includeAdditionalFiles = true
let queryUserFilesResponse = try await carbonai.files.queryUserFiles(
    pagination: pagination,
    orderBy: orderBy,
    orderDir: orderDir,
    filters: filters,
    includeRawFile: includeRawFile,
    includeParsedTextFile: includeParsedTextFile,
    includeAdditionalFiles: includeAdditionalFiles
)

⚙️ Parameters

pagination: Pagination
order_by: OrganizationUserFilesToSyncOrderByTypes
order_dir: OrderDir
include_raw_file: Bool
include_parsed_text_file: Bool
include_additional_files: Bool

🔄 Return

UserFilesV2

🌐 Endpoint

/user_files_v2 POST

🔙 Back to Table of Contents


carbonai.files.queryUserFilesDeprecated

This route is deprecated. Use /user_files_v2 instead.

🛠️ Usage

let pagination = Pagination(
    limit: 123,
    offset: 123
)
let orderBy = OrganizationUserFilesToSyncOrderByTypes(
    
)
let orderDir = OrderDir(
    
)
let filters = OrganizationUserFilesToSyncFilters(
    tags: "TODO",
    source: SourceProperty(
        
    ),
    name: "name_example",
    tagsV2: "TODO",
    ids: [
    123
    ],
    externalFileIds: [
    "externalFileIds_example"
    ],
    syncStatuses: [
    ExternalFileSyncStatuses.delayed
    ],
    parentFileIds: [
    123
    ],
    organizationUserDataSourceId: [
    123
    ],
    embeddingGenerators: [
    EmbeddingGenerators.openai
    ],
    rootFilesOnly: false,
    includeAllChildren: false,
    nonSyncedOnly: false,
    requestIds: [
    "requestIds_example"
    ]
)
let includeRawFile = true
let includeParsedTextFile = true
let includeAdditionalFiles = true
let queryUserFilesDeprecatedResponse = try await carbonai.files.queryUserFilesDeprecated(
    pagination: pagination,
    orderBy: orderBy,
    orderDir: orderDir,
    filters: filters,
    includeRawFile: includeRawFile,
    includeParsedTextFile: includeParsedTextFile,
    includeAdditionalFiles: includeAdditionalFiles
)

⚙️ Parameters

pagination: Pagination
order_by: OrganizationUserFilesToSyncOrderByTypes
order_dir: OrderDir
include_raw_file: Bool
include_parsed_text_file: Bool
include_additional_files: Bool

🔄 Return

UserFile

🌐 Endpoint

/user_files POST

🔙 Back to Table of Contents


carbonai.files.resync

Resync File

🛠️ Usage

let fileId = 987
let chunkSize = 987
let chunkOverlap = 987
let forceEmbeddingGeneration = true
let resyncResponse = try await carbonai.files.resync(
    fileId: fileId,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    forceEmbeddingGeneration: forceEmbeddingGeneration
)

⚙️ Parameters

file_id: Int
chunk_size: Int
chunk_overlap: Int
force_embedding_generation: Bool

🔄 Return

UserFile

🌐 Endpoint

/resync_file POST

🔙 Back to Table of Contents


carbonai.files.upload

This endpoint is used to directly upload local files to Carbon. The POST request should be a multipart form request. Note that the set_page_as_boundary query parameter is applicable only to PDFs for now. When this value is set, PDF chunks are at most one page long. Additional information can be retrieved for each chunk, however, namely the coordinates of the bounding box around the chunk (this can be used for things like text highlighting). Following is a description of all possible query parameters:

  • chunk_size: the chunk size (in tokens) applied when splitting the document
  • chunk_overlap: the chunk overlap (in tokens) applied when splitting the document
  • skip_embedding_generation: whether or not to skip the generation of chunks and embeddings
  • set_page_as_boundary: described above
  • embedding_model: the model used to generate embeddings for the document chunks
  • use_ocr: whether or not to use OCR as a preprocessing step prior to generating chunks (only valid for PDFs currently)
  • generate_sparse_vectors: whether or not to generate sparse vectors for the file. Required for hybrid search.
  • prepend_filename_to_chunks: whether or not to prepend the filename to the chunk text

Carbon supports multiple models for use in generating embeddings for files. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's text-embedding-ada-002 and Cohere's embed-multilingual-v3.0. The model can be specified via the embedding_model parameter (in the POST body for /embeddings, and a query parameter in /uploadfile). If no model is supplied, the text-embedding-ada-002 is used by default. When performing embedding queries, embeddings from files that used the specified model will be considered in the query. For example, if files A and B have embeddings generated with OPENAI, and files C and D have embeddings generated with COHERE_MULTILINGUAL_V3, then by default, queries will only consider files A and B. If COHERE_MULTILINGUAL_V3 is specified as the embedding_model in /embeddings, then only files C and D will be considered. Make sure that the set of all files you want considered for a query have embeddings generated via the same model. For now, do not set VERTEX_MULTIMODAL as an embedding_model. This model is used automatically by Carbon when it detects an image file.

🛠️ Usage

let file = URL(string: "https://example.com")!
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = false
let setPageAsBoundary = false
let embeddingModel = TextEmbeddingGenerators(
    
)
let useOcr = false
let generateSparseVectors = false
let prependFilenameToChunks = false
let maxItemsPerChunk = 987
let parsePdfTablesWithOcr = false
let uploadResponse = try await carbonai.files.upload(
    file: file,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    setPageAsBoundary: setPageAsBoundary,
    embeddingModel: embeddingModel,
    useOcr: useOcr,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    maxItemsPerChunk: maxItemsPerChunk,
    parsePdfTablesWithOcr: parsePdfTablesWithOcr
)

⚙️ Parameters

file: URL
chunkSize: Int

Chunk size in tiktoken tokens to be used when processing file.

chunkOverlap: Int

Chunk overlap in tiktoken tokens to be used when processing file.

skipEmbeddingGeneration: Bool

Flag to control whether or not embeddings should be generated and stored when processing file.

setPageAsBoundary: Bool

Flag to control whether or not to set a page's worth of content as the maximum amount of content that can appear in a chunk. Only valid for PDFs. See the route description for more information.

embeddingModel: TextEmbeddingGenerators

Embedding model that will be used to embed file chunks.

useOcr: Bool

Whether or not to use OCR when processing files. Only valid for PDFs. Useful for documents with tables, images, and/or scanned text.

generateSparseVectors: Bool

Whether or not to generate sparse vectors for the file. This is required for the file to be a candidate for hybrid search.

prependFilenameToChunks: Bool

Whether or not to prepend the file's name to chunks.

maxItemsPerChunk: Int

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

parsePdfTablesWithOcr: Bool

Whether to use rich table parsing when use_ocr is enabled.

🔄 Return

UserFile

🌐 Endpoint

/uploadfile POST

🔙 Back to Table of Contents


carbonai.files.uploadFromUrl

Create Upload File From Url

🛠️ Usage

let url = "url_example"
let fileName = "fileName_example"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let setPageAsBoundary = true
let embeddingModel = EmbeddingGenerators(
    
)
let generateSparseVectors = true
let useTextract = true
let prependFilenameToChunks = true
let maxItemsPerChunk = 987
let parsePdfTablesWithOcr = true
let uploadFromUrlResponse = try await carbonai.files.uploadFromUrl(
    url: url,
    fileName: fileName,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    setPageAsBoundary: setPageAsBoundary,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    useTextract: useTextract,
    prependFilenameToChunks: prependFilenameToChunks,
    maxItemsPerChunk: maxItemsPerChunk,
    parsePdfTablesWithOcr: parsePdfTablesWithOcr
)

⚙️ Parameters

url: String
file_name: String
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
set_page_as_boundary: Bool
embedding_model: EmbeddingGenerators
generate_sparse_vectors: Bool
use_textract: Bool
prepend_filename_to_chunks: Bool
max_items_per_chunk: Int

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

parse_pdf_tables_with_ocr: Bool

🔄 Return

UserFile

🌐 Endpoint

/upload_file_from_url POST

🔙 Back to Table of Contents


carbonai.files.uploadText

Carbon supports multiple models for use in generating embeddings for files. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's text-embedding-ada-002 and Cohere's embed-multilingual-v3.0. The model can be specified via the embedding_model parameter (in the POST body for /embeddings, and a query parameter in /uploadfile). If no model is supplied, the text-embedding-ada-002 is used by default. When performing embedding queries, embeddings from files that used the specified model will be considered in the query. For example, if files A and B have embeddings generated with OPENAI, and files C and D have embeddings generated with COHERE_MULTILINGUAL_V3, then by default, queries will only consider files A and B. If COHERE_MULTILINGUAL_V3 is specified as the embedding_model in /embeddings, then only files C and D will be considered. Make sure that the set of all files you want considered for a query have embeddings generated via the same model. For now, do not set VERTEX_MULTIMODAL as an embedding_model. This model is used automatically by Carbon when it detects an image file.

🛠️ Usage

let contents = "contents_example"
let name = "name_example"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let overwriteFileId = 987
let embeddingModel = EmbeddingGeneratorsNullable(
    
)
let generateSparseVectors = true
let uploadTextResponse = try await carbonai.files.uploadText(
    contents: contents,
    name: name,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    overwriteFileId: overwriteFileId,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors
)

⚙️ Parameters

contents: String
name: String
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
overwrite_file_id: Int
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: Bool

🔄 Return

UserFile

🌐 Endpoint

/upload_text POST

🔙 Back to Table of Contents


carbonai.health.check

Health

🛠️ Usage

let checkResponse = try await carbonai.health.check()

🌐 Endpoint

/health GET

🔙 Back to Table of Contents


carbonai.integrations.connectDataSource

Connect Data Source

🛠️ Usage

let authentication = AuthenticationProperty(
    source: "TODO",
    accessToken: "accessToken_example",
    refreshToken: "refreshToken_example",
    workspaceId: "workspaceId_example",
    tenantName: "tenantName_example",
    siteName: "siteName_example",
    subdomain: "subdomain_example",
    accessTokenSecret: "accessTokenSecret_example",
    username: "username_example",
    zoteroId: "zoteroId_example",
    organizationName: "organizationName_example",
    domain: "domain_example",
    apiKey: "apiKey_example",
    accessKey: "accessKey_example",
    accessKeySecret: "accessKeySecret_example"
)
let syncOptions = SyncOptions(
    tags: "TODO",
    chunkSize: 123,
    chunkOverlap: 123,
    skipEmbeddingGeneration: false,
    embeddingModel: EmbeddingGeneratorsNullable.openai,
    generateSparseVectors: false,
    prependFilenameToChunks: false,
    maxItemsPerChunk: 123,
    syncFilesOnConnection: true,
    setPageAsBoundary: false
)
let connectDataSourceResponse = try await carbonai.integrations.connectDataSource(
    authentication: authentication,
    syncOptions: syncOptions
)

⚙️ Parameters

authentication: AuthenticationProperty
sync_options: SyncOptions

🔄 Return

ConnectDataSourceResponse

🌐 Endpoint

/integrations/connect POST

🔙 Back to Table of Contents


carbonai.integrations.connectFreshdesk

Refer to this article to obtain an API key: https://support.freshdesk.com/en/support/solutions/articles/215517. Make sure that your API key has permission to read solutions from your account and that you are on a paid plan. Once you have an API key, you can make a request to this endpoint along with your Freshdesk domain. This will trigger an automatic sync of the articles in your "solutions" tab. The additional parameters below can be used to associate data with the synced articles or modify the sync behavior.

🛠️ Usage

let domain = "domain_example"
let apiKey = "apiKey_example"
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGeneratorsNullable.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let syncFilesOnConnection = true
let requestId = "requestId_example"
let connectFreshdeskResponse = try await carbonai.integrations.connectFreshdesk(
    domain: domain,
    apiKey: apiKey,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    syncFilesOnConnection: syncFilesOnConnection,
    requestId: requestId
)

⚙️ Parameters

domain: String
api_key: String
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
sync_files_on_connection: Bool
request_id: String

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/freshdesk POST

🔙 Back to Table of Contents


carbonai.integrations.connectGitbook

You will need an access token to connect your Gitbook account. Note that the permissions are defined by the user generating the access token, so make sure you have permission to access the spaces you will be syncing. Refer to this article for more details: https://developer.gitbook.com/gitbook-api/authentication. Additionally, you need to specify the name of the organization you will be syncing data from.

🛠️ Usage

let organization = "organization_example"
let accessToken = "accessToken_example"
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGenerators.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let syncFilesOnConnection = true
let requestId = "requestId_example"
let connectGitbookResponse = try await carbonai.integrations.connectGitbook(
    organization: organization,
    accessToken: accessToken,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    syncFilesOnConnection: syncFilesOnConnection,
    requestId: requestId
)

⚙️ Parameters

organization: String
access_token: String
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGenerators
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
sync_files_on_connection: Bool
request_id: String

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/gitbook POST

🔙 Back to Table of Contents


carbonai.integrations.createAwsIamUser

Create a new IAM user with permissions to:

  1. List all buckets.
  2. Read from the specific buckets and objects to sync with Carbon. Ensure any future buckets or objects carry the same permissions.

Once created, generate an access key for this user and share the credentials with us. We recommend testing this key beforehand.
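
For reference, the permissions above can be expressed as an IAM policy. This is a minimal sketch, not an official policy: the bucket name my-carbon-bucket is a placeholder, and you should scope the Resource entries to the buckets you actually intend to sync.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListAllMyBuckets"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject"],
      "Resource": [
        "arn:aws:s3:::my-carbon-bucket",
        "arn:aws:s3:::my-carbon-bucket/*"
      ]
    }
  ]
}
```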

🛠️ Usage

let accessKey = "accessKey_example"
let accessKeySecret = "accessKeySecret_example"
let createAwsIamUserResponse = try await carbonai.integrations.createAwsIamUser(
    accessKey: accessKey,
    accessKeySecret: accessKeySecret
)

⚙️ Parameters

access_key: String
access_key_secret: String

🔄 Return

OrganizationUserDataSourceAPI

🌐 Endpoint

/integrations/s3 POST

🔙 Back to Table of Contents


carbonai.integrations.getOauthUrl

This endpoint can be used to generate the following URLs:

  • An OAuth URL for OAuth based connectors
  • A file syncing URL which skips the OAuth flow if the user already has a valid access token and takes them to the success state.

🛠️ Usage

let service = DataSourceType(
    
)
let tags = "TODO"
let scope = "scope_example"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGeneratorsNullable.openai
let zendeskSubdomain = "zendeskSubdomain_example"
let microsoftTenant = "microsoftTenant_example"
let sharepointSiteName = "sharepointSiteName_example"
let confluenceSubdomain = "confluenceSubdomain_example"
let generateSparseVectors = true
let prependFilenameToChunks = true
let maxItemsPerChunk = 987
let salesforceDomain = "salesforceDomain_example"
let syncFilesOnConnection = true
let setPageAsBoundary = true
let dataSourceId = 987
let connectingNewAccount = true
let requestId = "requestId_example"
let useOcr = true
let parsePdfTablesWithOcr = true
let getOauthUrlResponse = try await carbonai.integrations.getOauthUrl(
    service: service,
    tags: tags,
    scope: scope,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    zendeskSubdomain: zendeskSubdomain,
    microsoftTenant: microsoftTenant,
    sharepointSiteName: sharepointSiteName,
    confluenceSubdomain: confluenceSubdomain,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    maxItemsPerChunk: maxItemsPerChunk,
    salesforceDomain: salesforceDomain,
    syncFilesOnConnection: syncFilesOnConnection,
    setPageAsBoundary: setPageAsBoundary,
    dataSourceId: dataSourceId,
    connectingNewAccount: connectingNewAccount,
    requestId: requestId,
    useOcr: useOcr,
    parsePdfTablesWithOcr: parsePdfTablesWithOcr
)

⚙️ Parameters

service: DataSourceType
tags: AnyCodable
scope: String
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGeneratorsNullable
zendesk_subdomain: String
microsoft_tenant: String
sharepoint_site_name: String
confluence_subdomain: String
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
max_items_per_chunk: Int

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

salesforce_domain: String
sync_files_on_connection: Bool

Used to specify whether Carbon should attempt to sync all your files automatically when authorization is complete. This is only supported for a subset of connectors and will be ignored for the rest. Supported connectors: Intercom, Zendesk, Gitbook, Confluence, Salesforce, Freshdesk

set_page_as_boundary: Bool
data_source_id: Int

Used to specify a data source to sync from if you have multiple connected. It can be skipped if you only have one data source of that type connected or are connecting a new account.

connecting_new_account: Bool

Used to connect a new data source. If not specified, we will attempt to create a sync URL for an existing data source based on type and ID.

request_id: String

This request id will be added to all files that get synced using the generated OAuth URL

use_ocr: Bool

Enable OCR for files that support it. Supported formats: pdf

parse_pdf_tables_with_ocr: Bool

🔄 Return

OuthURLResponse

🌐 Endpoint

/integrations/oauth_url POST

🔙 Back to Table of Contents


carbonai.integrations.listConfluencePages

To begin listing a user's Confluence pages, at least the data_source_id of a connected Confluence account must be specified. This base request returns a list of root pages for every space the user has access to in a Confluence instance. To traverse further down the user's page directory, make additional requests to this endpoint with the same data_source_id and with parent_id set to the id of a page from a previous request. For convenience, the has_children property on each item in the response list flags which pages will return non-empty lists of pages when set as the parent_id.

🛠️ Usage

let dataSourceId = 987
let parentId = "parentId_example"
let listConfluencePagesResponse = try await carbonai.integrations.listConfluencePages(
    dataSourceId: dataSourceId,
    parentId: parentId
)
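
For example, the page directory can be walked recursively using has_children. This is a hedged sketch rather than verified SDK code: the items, id, and hasChildren property names on the ListResponse model are assumptions, so check the generated models before use.

```swift
// Hedged sketch: recursively collect the ids of every page reachable
// from the root listing. The `items`, `id`, and `hasChildren` property
// names are assumptions about the generated ListResponse model.
func collectConfluencePageIds(
    _ carbonai: CarbonAIClient,
    dataSourceId: Int,
    parentId: String? = nil
) async throws -> [String] {
    let response = try await carbonai.integrations.listConfluencePages(
        dataSourceId: dataSourceId,
        parentId: parentId
    )
    var pageIds: [String] = []
    for item in response.items {
        pageIds.append(item.id)
        // has_children flags pages whose child listing is non-empty
        if item.hasChildren {
            pageIds += try await collectConfluencePageIds(
                carbonai,
                dataSourceId: dataSourceId,
                parentId: item.id
            )
        }
    }
    return pageIds
}
```

The collected ids can then be passed to carbonai.integrations.syncConfluence.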

⚙️ Parameters

data_source_id: Int
parent_id: String

🔄 Return

ListResponse

🌐 Endpoint

/integrations/confluence/list POST

🔙 Back to Table of Contents


carbonai.integrations.listDataSourceItems

List Data Source Items

🛠️ Usage

let dataSourceId = 987
let parentId = "parentId_example"
let filters = ListItemsFiltersNullable(
    externalIds: ["externalIds_example"],
    ids: [123],
    name: "name_example",
    rootFilesOnly: false
)
let pagination = Pagination(
    limit: 123,
    offset: 123
)
let listDataSourceItemsResponse = try await carbonai.integrations.listDataSourceItems(
    dataSourceId: dataSourceId,
    parentId: parentId,
    filters: filters,
    pagination: pagination
)

⚙️ Parameters

data_source_id: Int
parent_id: String
filters: ListItemsFiltersNullable
pagination: Pagination

🔄 Return

ListDataSourceItemsResponse

🌐 Endpoint

/integrations/items/list POST

🔙 Back to Table of Contents


carbonai.integrations.listFolders

After connecting your Outlook account, you can use this endpoint to list all of your folders on Outlook. This includes both system folders like "inbox" and user-created folders.

🛠️ Usage

let dataSourceId = 987
let listFoldersResponse = try await carbonai.integrations.listFolders(
    dataSourceId: dataSourceId
)

⚙️ Parameters

data_source_id: Int

🌐 Endpoint

/integrations/outlook/user_folders GET

🔙 Back to Table of Contents


carbonai.integrations.listGitbookSpaces

After connecting your Gitbook account, you can use this endpoint to list all of your spaces under the current organization.

🛠️ Usage

let dataSourceId = 987
let listGitbookSpacesResponse = try await carbonai.integrations.listGitbookSpaces(
    dataSourceId: dataSourceId
)

⚙️ Parameters

data_source_id: Int

🌐 Endpoint

/integrations/gitbook/spaces GET

🔙 Back to Table of Contents


carbonai.integrations.listLabels

After connecting your Gmail account, you can use this endpoint to list all of your labels. User-created labels have the type "user" and Gmail's default labels have the type "system".

🛠️ Usage

let dataSourceId = 987
let listLabelsResponse = try await carbonai.integrations.listLabels(
    dataSourceId: dataSourceId
)

⚙️ Parameters

data_source_id: Int

🌐 Endpoint

/integrations/gmail/user_labels GET

🔙 Back to Table of Contents


carbonai.integrations.listOutlookCategories

After connecting your Outlook account, you can use this endpoint to list all of your categories on Outlook. We currently support listing up to 250 categories.

🛠️ Usage

let dataSourceId = 987
let listOutlookCategoriesResponse = try await carbonai.integrations.listOutlookCategories(
    dataSourceId: dataSourceId
)

⚙️ Parameters

data_source_id: Int

🌐 Endpoint

/integrations/outlook/user_categories GET

🔙 Back to Table of Contents


carbonai.integrations.syncConfluence

After listing pages in a user's Confluence account, the set of selected page ids and the connected account's data_source_id can be passed into this endpoint to sync them into Carbon. The additional parameters listed below can be used to associate data with the selected pages or alter the sync behavior.

🛠️ Usage

let dataSourceId = 987
let ids = IdsProperty(
    
)
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGeneratorsNullable.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let maxItemsPerChunk = 987
let setPageAsBoundary = true
let requestId = "requestId_example"
let useOcr = true
let parsePdfTablesWithOcr = true
let syncConfluenceResponse = try await carbonai.integrations.syncConfluence(
    dataSourceId: dataSourceId,
    ids: ids,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    maxItemsPerChunk: maxItemsPerChunk,
    setPageAsBoundary: setPageAsBoundary,
    requestId: requestId,
    useOcr: useOcr,
    parsePdfTablesWithOcr: parsePdfTablesWithOcr
)

⚙️ Parameters

data_source_id: Int
ids: IdsProperty
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
max_items_per_chunk: Int

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

set_page_as_boundary: Bool
request_id: String
use_ocr: Bool
parse_pdf_tables_with_ocr: Bool

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/confluence/sync POST

🔙 Back to Table of Contents


carbonai.integrations.syncDataSourceItems

Sync Data Source Items

🛠️ Usage

let dataSourceId = 987
let syncDataSourceItemsResponse = try await carbonai.integrations.syncDataSourceItems(
    dataSourceId: dataSourceId
)

⚙️ Parameters

data_source_id: Int

🔄 Return

OrganizationUserDataSourceAPI

🌐 Endpoint

/integrations/items/sync POST

🔙 Back to Table of Contents


carbonai.integrations.syncFiles

After listing files and folders via /integrations/items/sync and /integrations/items/list, use the selected items' external ids as the ids in this endpoint to sync them into Carbon. SharePoint items take an additional parameter, root_id, which identifies the drive the file or folder is in and is stored in root_external_id. That additional parameter is optional; excluding it tells the sync to assume the item is stored in the default Documents drive.

🛠️ Usage

let dataSourceId = 987
let ids = IdsProperty(
    
)
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGeneratorsNullable.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let maxItemsPerChunk = 987
let setPageAsBoundary = true
let requestId = "requestId_example"
let useOcr = true
let parsePdfTablesWithOcr = true
let syncFilesResponse = try await carbonai.integrations.syncFiles(
    dataSourceId: dataSourceId,
    ids: ids,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    maxItemsPerChunk: maxItemsPerChunk,
    setPageAsBoundary: setPageAsBoundary,
    requestId: requestId,
    useOcr: useOcr,
    parsePdfTablesWithOcr: parsePdfTablesWithOcr
)

⚙️ Parameters

data_source_id: Int
ids: IdsProperty
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
max_items_per_chunk: Int

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

set_page_as_boundary: Bool
request_id: String
use_ocr: Bool
parse_pdf_tables_with_ocr: Bool

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/files/sync POST

🔙 Back to Table of Contents


carbonai.integrations.syncGitbook

You can sync up to 20 Gitbook spaces at a time using this endpoint. The additional parameters below can be used to associate data with the synced pages or modify the sync behavior.

🛠️ Usage

let spaceIds = [
    "inner_example"
]
let dataSourceId = 987
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGenerators.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let requestId = "requestId_example"
let syncGitbookResponse = try await carbonai.integrations.syncGitbook(
    spaceIds: spaceIds,
    dataSourceId: dataSourceId,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    requestId: requestId
)

⚙️ Parameters

space_ids: [String]
data_source_id: Int
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGenerators
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
request_id: String

🌐 Endpoint

/integrations/gitbook/sync POST

🔙 Back to Table of Contents


carbonai.integrations.syncGmail

Once you have successfully connected your Gmail account, you can choose which emails to sync with us using the filters parameter. Filters is a JSON object of key-value pairs. It also supports AND and OR operations. For now, we support the limited set of keys listed below.

  • label: Built-in Gmail labels, for example "Important", or a custom label you created.
  • after or before: A date in YYYY/mm/dd format (example: 2023/12/31). Gets emails after/before a certain date. You can also use them in combination to get emails from a certain period.
  • is: Can have the following values: starred, important, snoozed, and unread.

Using keys or values outside of those specified can lead to unexpected behavior.

A basic query with filters can look like this:

{
    "filters": {
            "key": "label",
            "value": "Test"
        }
}

This will list all emails that have the label "Test".

You can use AND and OR operations in the following way:

{
    "filters": {
        "AND": [
            {
                "key": "after",
                "value": "2024/01/07"
            },
            {
                "OR": [
                    {
                        "key": "label",
                        "value": "Personal"
                    },
                    {
                        "key": "is",
                        "value": "starred"
                    }
                ]
            }
        ]
    }
}

This will return emails after January 7th that are either starred or have the label "Personal". Note that this is the highest level of nesting we support, i.e. you can't add more AND/OR filters within the OR filter in the above example.
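
Since the SDK accepts filters as AnyCodable, the nested query above can be built as a plain dictionary and serialized with Foundation to sanity-check its shape. This is a minimal sketch; exactly how the dictionary is wrapped for the syncGmail call depends on the generated AnyCodable type, which is an assumption to verify against the SDK.

```swift
import Foundation

// Build the nested AND/OR filter from the example above as a plain
// dictionary. How this is wrapped into the SDK's AnyCodable `filters`
// parameter depends on the generated type (an assumption to verify).
let filters: [String: Any] = [
    "AND": [
        ["key": "after", "value": "2024/01/07"],
        [
            "OR": [
                ["key": "label", "value": "Personal"],
                ["key": "is", "value": "starred"]
            ]
        ]
    ]
]

// Serialize to JSON to verify the shape before calling syncGmail.
let data = try JSONSerialization.data(
    withJSONObject: ["filters": filters],
    options: [.sortedKeys]
)
print(String(data: data, encoding: .utf8)!)
```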

🛠️ Usage

let filters = "TODO"
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGenerators.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let dataSourceId = 987
let requestId = "requestId_example"
let syncGmailResponse = try await carbonai.integrations.syncGmail(
    filters: filters,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    dataSourceId: dataSourceId,
    requestId: requestId
)

⚙️ Parameters

filters: AnyCodable
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGenerators
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
data_source_id: Int
request_id: String

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/gmail/sync POST

🔙 Back to Table of Contents


carbonai.integrations.syncOutlook

Once you have successfully connected your Outlook account, you can choose which emails to sync with us using the filters and folder parameters. "folder" should be the folder you want to sync from Outlook; by default we get messages from your inbox folder.
Filters is a JSON object of key-value pairs. It also supports AND and OR operations. For now, we support the limited set of keys listed below.

  • category: Custom categories that you created in Outlook.
  • after or before: A date in YYYY/mm/dd format (example: 2023/12/31). Gets emails after/before a certain date. You can also use them in combination to get emails from a certain period.
  • is: Can have the value flagged.

A basic query with filters can look like this:

{
    "filters": {
            "key": "category",
            "value": "Test"
        }
}

This will list all emails that have the category "Test".

You can specify a custom folder in the same query:

{
    "folder": "Folder Name",
    "filters": {
            "key": "category",
            "value": "Test"
        }
}

You can use AND and OR operations in the following way:

{
    "filters": {
        "AND": [
            {
                "key": "after",
                "value": "2024/01/07"
            },
            {
                "OR": [
                    {
                        "key": "category",
                        "value": "Personal"
                    },
                    {
                        "key": "category",
                        "value": "Test"
                    }
                ]
            }
        ]
    }
}

This will return emails after January 7th that have either Personal or Test as their category. Note that this is the highest level of nesting we support, i.e. you can't add more AND/OR filters within the OR filter in the above example.

🛠️ Usage

let filters = "TODO"
let tags = "TODO"
let folder = "folder_example"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGenerators.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let dataSourceId = 987
let requestId = "requestId_example"
let syncOutlookResponse = try await carbonai.integrations.syncOutlook(
    filters: filters,
    tags: tags,
    folder: folder,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    dataSourceId: dataSourceId,
    requestId: requestId
)

⚙️ Parameters

filters: AnyCodable
tags: AnyCodable
folder: String
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGenerators
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
data_source_id: Int
request_id: String

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/outlook/sync POST

🔙 Back to Table of Contents


carbonai.integrations.syncRssFeed

RSS Feed

🛠️ Usage

let url = "url_example"
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGenerators.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let requestId = "requestId_example"
let syncRssFeedResponse = try await carbonai.integrations.syncRssFeed(
    url: url,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    requestId: requestId
)

⚙️ Parameters

url: String
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGenerators
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
request_id: String

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/rss_feed POST

🔙 Back to Table of Contents


carbonai.integrations.syncS3Files

After optionally loading the items via /integrations/items/sync and /integrations/items/list, use the bucket name and object key as the ID in this endpoint to sync them into Carbon. The additional parameters below can be used to associate data with the selected items or modify the sync behavior.

🛠️ Usage

let ids = [
    S3GetFileInput(
        id: "id_example",
        bucket: "bucket_example"
    )
]
let tags = "TODO"
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let embeddingModel = EmbeddingGenerators.openai
let generateSparseVectors = true
let prependFilenameToChunks = true
let maxItemsPerChunk = 987
let setPageAsBoundary = true
let dataSourceId = 987
let requestId = "requestId_example"
let useOcr = true
let parsePdfTablesWithOcr = true
let syncS3FilesResponse = try await carbonai.integrations.syncS3Files(
    ids: ids,
    tags: tags,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    embeddingModel: embeddingModel,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    maxItemsPerChunk: maxItemsPerChunk,
    setPageAsBoundary: setPageAsBoundary,
    dataSourceId: dataSourceId,
    requestId: requestId,
    useOcr: useOcr,
    parsePdfTablesWithOcr: parsePdfTablesWithOcr
)

⚙️ Parameters

ids: [S3GetFileInput]
tags: AnyCodable
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
embedding_model: EmbeddingGenerators
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
max_items_per_chunk: Int

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

set_page_as_boundary: Bool
data_source_id: Int
request_id: String
use_ocr: Bool
parse_pdf_tables_with_ocr: Bool

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/integrations/s3/files POST

🔙 Back to Table of Contents


carbonai.organizations.callGet

Get Organization

🛠️ Usage

let callGetResponse = try await carbonai.organizations.callGet()

🔄 Return

OrganizationResponse

🌐 Endpoint

/organization GET

🔙 Back to Table of Contents


carbonai.users.callGet

User Endpoint

🛠️ Usage

let customerId = "customerId_example"
let callGetResponse = try await carbonai.users.callGet(
    customerId: customerId
)

⚙️ Parameters

customer_id: String

🔄 Return

UserResponse

🌐 Endpoint

/user POST

🔙 Back to Table of Contents


carbonai.users.delete

Delete Users

🛠️ Usage

let customerIds = [
    "inner_example"
]
let deleteResponse = try await carbonai.users.delete(
    customerIds: customerIds
)

⚙️ Parameters

customer_ids: [String]

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/delete_users POST

🔙 Back to Table of Contents


carbonai.users.toggleUserFeatures

Toggle User Features

🛠️ Usage

let configurationKeyName = "configurationKeyName_example"
let value = "TODO"
let toggleUserFeaturesResponse = try await carbonai.users.toggleUserFeatures(
    configurationKeyName: configurationKeyName,
    value: value
)

⚙️ Parameters

configuration_key_name: String
value: AnyCodable

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/modify_user_configuration POST

🔙 Back to Table of Contents


carbonai.users.updateUsers

Update Users

🛠️ Usage

let customerIds = [
    "inner_example"
]
let autoSyncEnabledSources = AutoSyncEnabledSourcesProperty(
    
)
let updateUsersResponse = try await carbonai.users.updateUsers(
    customerIds: customerIds,
    autoSyncEnabledSources: autoSyncEnabledSources
)

⚙️ Parameters

customer_ids: [String]

List of organization supplied user IDs

auto_sync_enabled_sources: AutoSyncEnabledSourcesProperty

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/update_users POST

🔙 Back to Table of Contents


carbonai.utilities.fetchUrls

Extracts all URLs from a webpage.

Args: url (str): URL of the webpage

Returns: FetchURLsResponse: A response object with a list of URLs extracted from the webpage and the webpage content.

🛠️ Usage

let url = "url_example"
let fetchUrlsResponse = try await carbonai.utilities.fetchUrls(
    url: url
)

⚙️ Parameters

url: String

🔄 Return

FetchURLsResponse

🌐 Endpoint

/fetch_urls GET

🔙 Back to Table of Contents


carbonai.utilities.fetchYoutubeTranscripts

Fetches English transcripts from YouTube videos.

Args: id (str): The ID of the YouTube video. raw (bool): Whether to return the raw transcript or not. Defaults to False.

Returns: dict: A dictionary with the transcript of the YouTube video.

🛠️ Usage

let id = "id_example"
let raw = false
let fetchYoutubeTranscriptsResponse = try await carbonai.utilities.fetchYoutubeTranscripts(
    id: id,
    raw: raw
)

⚙️ Parameters

id: String
raw: Bool

🔄 Return

YoutubeTranscriptResponse

🌐 Endpoint

/fetch_youtube_transcript GET

🔙 Back to Table of Contents


carbonai.utilities.processSitemap

Retrieves all URLs from a sitemap, which can subsequently be utilized with our web_scrape endpoint.

🛠️ Usage

let url = "url_example"
let processSitemapResponse = try await carbonai.utilities.processSitemap(
    url: url
)

⚙️ Parameters

url: String

🌐 Endpoint

/process_sitemap GET

🔙 Back to Table of Contents


carbonai.utilities.scrapeSitemap

Extracts all URLs from a sitemap and performs a web scrape on each of them.

Args: sitemap_url (str): URL of the sitemap

Returns: dict: A response object with the status of the scraping job.

🛠️ Usage

let url = "url_example"
let tags = "TODO"
let maxPagesToScrape = 987
let chunkSize = 987
let chunkOverlap = 987
let skipEmbeddingGeneration = true
let enableAutoSync = true
let generateSparseVectors = true
let prependFilenameToChunks = true
let htmlTagsToSkip = [
    "inner_example"
]
let cssClassesToSkip = [
    "inner_example"
]
let cssSelectorsToSkip = [
    "inner_example"
]
let embeddingModel = EmbeddingGenerators.openai
let scrapeSitemapResponse = try await carbonai.utilities.scrapeSitemap(
    url: url,
    tags: tags,
    maxPagesToScrape: maxPagesToScrape,
    chunkSize: chunkSize,
    chunkOverlap: chunkOverlap,
    skipEmbeddingGeneration: skipEmbeddingGeneration,
    enableAutoSync: enableAutoSync,
    generateSparseVectors: generateSparseVectors,
    prependFilenameToChunks: prependFilenameToChunks,
    htmlTagsToSkip: htmlTagsToSkip,
    cssClassesToSkip: cssClassesToSkip,
    cssSelectorsToSkip: cssSelectorsToSkip,
    embeddingModel: embeddingModel
)

⚙️ Parameters

url: String
tags: [String: Tags1]
max_pages_to_scrape: Int
chunk_size: Int
chunk_overlap: Int
skip_embedding_generation: Bool
enable_auto_sync: Bool
generate_sparse_vectors: Bool
prepend_filename_to_chunks: Bool
html_tags_to_skip: [String]
css_classes_to_skip: [String]
css_selectors_to_skip: [String]
embedding_model: EmbeddingGenerators

🌐 Endpoint

/scrape_sitemap POST

🔙 Back to Table of Contents


carbonai.utilities.scrapeWeb

Conduct a web scrape on a given webpage URL. Our web scraper is fully compatible with JavaScript and supports recursion depth, enabling you to efficiently extract all content from the target website.

🛠️ Usage

let scrapeWebResponse = try await carbonai.utilities.scrapeWeb(
)

⚙️ Request Body

[WebscrapeRequest]

🌐 Endpoint

/web_scrape POST

🔙 Back to Table of Contents


carbonai.utilities.searchUrls

Perform a web search and obtain a list of relevant URLs.

As an illustration, when you perform a search for “content related to MRNA,” you will receive a list of links such as the following:

- https://tomrenz.substack.com/p/mrna-and-why-it-matters

- https://www.statnews.com/2020/11/10/the-story-of-mrna-how-a-once-dismissed-idea-became-a-leading-technology-in-the-covid-vaccine-race/

- https://www.statnews.com/2022/11/16/covid-19-vaccines-were-a-success-but-mrna-still-has-a-delivery-problem/

- https://joomi.substack.com/p/were-still-being-misled-about-how

Subsequently, you can submit these links to the web_scrape endpoint in order to retrieve the content of the respective web pages.

Args: query (str): Query to search for

Returns: FetchURLsResponse: A response object with a list of URLs for a given search query.
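
The search-then-scrape flow described above can be sketched as follows. This is a hypothetical sketch, not verified SDK code: the urls property on FetchURLsResponse, the url parameter on WebscrapeRequest, and the scrapeWeb parameter label are all assumptions about the generated models, so check them before use.

```swift
// Hedged sketch: search for URLs, then submit each one to the
// web_scrape endpoint. Model property and parameter names below are
// assumptions about the generated SDK types.
let searchResponse = try await carbonai.utilities.searchUrls(
    query: "content related to MRNA"
)
let requests = searchResponse.urls.map { WebscrapeRequest(url: $0) }
let scrapeWebResponse = try await carbonai.utilities.scrapeWeb(
    webscrapeRequest: requests
)
```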

🛠️ Usage

let query = "query_example"
let searchUrlsResponse = try await carbonai.utilities.searchUrls(
    query: query
)

⚙️ Parameters

query: String

🔄 Return

FetchURLsResponse

🌐 Endpoint

/search_urls GET

🔙 Back to Table of Contents


carbonai.webhooks.addUrl

Add Webhook Url

🛠️ Usage

let url = "url_example"
let addUrlResponse = try await carbonai.webhooks.addUrl(
    url: url
)

⚙️ Parameters

url: String

🔄 Return

Webhook

🌐 Endpoint

/add_webhook POST

🔙 Back to Table of Contents


carbonai.webhooks.deleteUrl

Delete Webhook Url

🛠️ Usage

let webhookId = 987
let deleteUrlResponse = try await carbonai.webhooks.deleteUrl(
    webhookId: webhookId
)

⚙️ Parameters

webhook_id: Int

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/delete_webhook/{webhook_id} DELETE

🔙 Back to Table of Contents


carbonai.webhooks.urls

Webhook Urls

🛠️ Usage

let pagination = Pagination(
    limit: 123,
    offset: 123
)
let orderBy = WebhookOrderByColumns(
    
)
let orderDir = OrderDir(
    
)
let filters = WebhookFilters(
    ids: [123]
)
let urlsResponse = try await carbonai.webhooks.urls(
    pagination: pagination,
    orderBy: orderBy,
    orderDir: orderDir,
    filters: filters
)

⚙️ Parameters

pagination: Pagination
order_by: WebhookOrderByColumns
order_dir: OrderDir
filters: WebhookFilters

🔄 Return

WebhookQueryResponse

🌐 Endpoint

/webhooks POST

🔙 Back to Table of Contents


Author

This Swift package is automatically generated by Konfig