Files Module - User Guide

Overview

The Files module provides comprehensive file management capabilities and serves as the entry point for document intelligence in the platform. When files are uploaded, they automatically undergo AI-powered analysis to extract structured information that feeds into the platform's Knowledge Base and powers other intelligent features.

The module includes both manual file management and automated folder synchronization capabilities, supporting multiple cloud providers for seamless document workflow integration.

Note: The Files module must be enabled for your company before you can use it. Enhanced analysis features require additional AI agents for classification and data extraction.

The File Intelligence Pipeline

📂 From Upload to Intelligence

When you upload a file to the system (either manually or through automated synchronization), it goes through a sophisticated automated analysis process:

File Upload → Content Extraction → AI Classification → Data Analysis → Knowledge Base Integration → Business Intelligence

Step 1: Content Extraction

  • System extracts text content from uploaded documents
  • Supports PDF, Word, Excel, and other common formats
  • Content becomes available for AI processing

Step 2: AI Classification

  • FileClassificator Agent analyzes document content
  • Automatically determines document type (invoice, contract, offer, report, etc.)
  • Learns from your company's document patterns over time
  • Creates or updates document type categories

Step 3: Intelligent Analysis

  • FileAnalyzer Agent extracts structured business data
  • Uses predefined schemas based on document type
  • Applies AI analysis to identify key information fields
  • Generates structured data ready for business use

Step 4: Knowledge Base Integration

  • Extracted information automatically creates structured business records
  • Data becomes searchable and filterable in your Knowledge Base
  • Information feeds into dashboards, reports, and AI insights
  • Links maintained between original file and extracted data
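The classification and analysis stages above can be sketched as a chain of functions. This is only an illustrative sketch: `FileRecord`, `classify`, `analyze`, and `run_pipeline` are hypothetical names, and the keyword match and field lists stand in for the real FileClassificator and FileAnalyzer agents, whose internals this guide does not describe.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileRecord:
    """Hypothetical in-memory stand-in for an uploaded file."""
    filename: str
    raw_text: str
    doc_type: Optional[str] = None
    extracted_data: dict = field(default_factory=dict)
    status: str = "Pending"


def classify(record: FileRecord) -> FileRecord:
    # Placeholder for the FileClassificator Agent: a keyword match, not real AI.
    text = record.raw_text.lower()
    record.doc_type = "invoice" if "invoice" in text else "report"
    record.status = "Classified"
    return record


def analyze(record: FileRecord) -> FileRecord:
    # Placeholder for the FileAnalyzer Agent: picks a field list by document type.
    schemas = {"invoice": ["customer_name", "amount"], "report": ["title"]}
    record.extracted_data = {name: None for name in schemas[record.doc_type]}
    record.status = "Processed"
    return record


def run_pipeline(record: FileRecord) -> FileRecord:
    # Stages run in the order described above: classify first, then analyze.
    for stage in (classify, analyze):
        record = stage(record)
    return record


result = run_pipeline(FileRecord("inv-001.pdf", "Invoice for Acme Corp"))
```

The status values mirror the Pending, Classified, and Processed indicators shown in the File Manager.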

🤖 AI Agents in Action

FileClassificator Agent:

  • Determines document types automatically
  • Learns your company's document categories
  • Ensures consistent classification across uploads
  • Updates type definitions based on new document patterns

FileAnalyzer Agent:

  • Extracts business-relevant information from classified documents
  • Uses customizable extraction templates per document type
  • Processes content intelligently to identify key data points
  • Creates structured records ready for business analysis

Manual File Management

File Manager Layout

When you access the Files module at /files/manager, you'll see:

  • File List: Comprehensive table of all uploaded files
  • Search & Filter: Find files by name or document type
  • File Details: Metadata and processing status for each file
  • Actions: Download, reprocess, or delete files

File Information Display

Table Columns:

  • ID: Unique file identifier
  • Filename: Original file name with visual formatting
  • Type: AI-determined document category
  • Status: Processing status and extracted information availability
  • Actions: Available operations for each file

File Status Indicators:

  • Processed: File analyzed and Knowledge Base record created
  • Classified: Type determined but analysis pending
  • Pending: Upload complete, processing in progress
  • Error: Processing failed, manual intervention needed

Automated Folder Synchronization

🔄 Drive Integration Overview

The platform includes powerful automated synchronization capabilities that connect your cloud storage folders directly to the intelligent file processing pipeline. This means documents added to configured folders are automatically imported, analyzed, and integrated into your Knowledge Base without manual intervention.

Access: Drive synchronization is available at /drive/manager and requires the files module permission.

Supported Cloud Providers

Google Drive:

  • Full folder synchronization from shared or personal folders
  • Automatic detection of new, modified, and deleted files
  • Real-time content extraction and analysis
  • Support for all Google Drive-compatible file formats

Nextcloud:

  • WebDAV-based synchronization with Nextcloud servers
  • Secure authentication with encrypted credentials
  • Custom folder path configuration
  • Full file lifecycle management

Setting Up Folder Synchronization

Configuration Process:

  1. Access Drive Manager: Navigate to the Drive module interface
  2. Add New Sync Folder: Click "Add" to configure a new synchronized folder
  3. Choose Provider: Select Google Drive or Nextcloud
  4. Configure Access: Provide folder ID/link and authentication details
  5. Start Sync: Initiate first synchronization and set up automatic monitoring

Google Drive Setup:

  • Folder Identification: Paste folder sharing link or extract folder ID
  • Access Permissions: Ensure folder is accessible with appropriate permissions
  • Automatic Processing: System extracts folder ID from sharing links automatically
  • Visual Feedback: Interface shows original link for verification
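As a rough illustration of how a folder ID can be pulled out of a sharing link, the sketch below handles the common `.../folders/<id>` and `...?id=<id>` link shapes. The function name `extract_folder_id` and the assumed ID character set are illustrative assumptions, not the platform's actual implementation.

```python
import re
from typing import Optional

# Matches ".../folders/<id>" and "...?id=<id>" style links. Assumption: IDs
# consist of letters, digits, hyphens, and underscores.
_FOLDER_ID = re.compile(r"(?:/folders/|[?&]id=)([A-Za-z0-9_-]+)")


def extract_folder_id(link_or_id: str) -> Optional[str]:
    """Return the folder ID from a sharing link, or the input if it is already a bare ID."""
    value = link_or_id.strip()
    match = _FOLDER_ID.search(value)
    if match:
        return match.group(1)
    # No link pattern found: accept the value only if it looks like a bare ID.
    if re.fullmatch(r"[A-Za-z0-9_-]+", value):
        return value
    return None
```

Returning `None` for unrecognized input lets a caller show a validation message instead of starting a sync against a wrong folder.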

Nextcloud Setup:

  • Server Configuration: Provide Nextcloud server URL
  • Authentication: Username and password for WebDAV access
  • WebDAV Path: Custom WebDAV root path (typically /remote.php/webdav/)
  • Folder Path: Specific folder path within Nextcloud instance
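The four Nextcloud settings combine into a single WebDAV URL. The sketch below shows one way to join them; `build_webdav_url` is a hypothetical helper and the example host is made up.

```python
from urllib.parse import quote


def build_webdav_url(server_url: str, webdav_root: str, folder_path: str) -> str:
    """Join server URL, WebDAV root, and folder path into one URL."""
    base = server_url.rstrip("/")
    root = "/" + webdav_root.strip("/")
    # Percent-encode each path segment so spaces and similar characters survive.
    segments = [quote(part) for part in folder_path.strip("/").split("/") if part]
    return base + root + "/" + "/".join(segments)


url = build_webdav_url(
    "https://cloud.example.com/",   # hypothetical server URL
    "/remote.php/webdav/",          # typical WebDAV root named in this guide
    "Documents/Invoices 2024",      # hypothetical folder path
)
```

Normalizing slashes on each part means the result is the same whether or not the user typed leading or trailing slashes.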

Synchronization Process

Sync Cycle Operations:

  1. Change Detection: System monitors folders for new, modified, or deleted files
  2. File Retrieval: New or updated files downloaded to platform storage
  3. Content Processing: Files undergo same analysis pipeline as manual uploads
  4. Knowledge Integration: Extracted data automatically creates business records
  5. Status Tracking: Sync history and file processing status maintained

Smart Synchronization Features:

  • Incremental Updates: Only changed files are processed during sync cycles
  • Deletion Handling: Files removed from cloud folders are cleaned up locally
  • Modification Detection: Changed files are re-analyzed and Knowledge Base updated
  • Error Recovery: Failed synchronizations are logged and can be retried
  • Background Processing: Sync operations run asynchronously without blocking interface
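Incremental updates, deletion handling, and modification detection all come down to comparing the folder's current listing with the one seen at the previous sync. A minimal sketch, assuming each listing is a mapping from file path to some fingerprint (a checksum or modification time); the names `SyncChanges` and `detect_changes` are hypothetical:

```python
from typing import Dict, List, NamedTuple


class SyncChanges(NamedTuple):
    added: List[str]
    modified: List[str]
    deleted: List[str]


def detect_changes(previous: Dict[str, str], current: Dict[str, str]) -> SyncChanges:
    """Compare two {path: fingerprint} snapshots of a folder."""
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    # A file counts as modified when it exists in both snapshots with a different fingerprint.
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return SyncChanges(added, modified, deleted)


changes = detect_changes(
    {"a.pdf": "v1", "b.pdf": "v1", "c.pdf": "v1"},
    {"a.pdf": "v1", "b.pdf": "v2", "d.pdf": "v1"},
)
```

Only the paths in `added` and `modified` would need to be downloaded and analyzed again; unchanged files are skipped.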

Drive Management Interface

Synchronized Folders Dashboard:

  • Folder List: All configured sync folders with status information
  • Last Sync Status: Timestamp and result of most recent synchronization
  • Sync Controls: Manual sync triggers and configuration management
  • Provider Information: Cloud provider type and connection details

Folder Status Indicators:

  • Available: Successfully configured and ready for synchronization
  • Syncing: Active synchronization in progress
  • Error: Sync failure requiring attention
  • Never Synced: Newly configured folder pending first sync

Management Operations:

  • Manual Sync: Trigger immediate synchronization for any folder
  • Edit Configuration: Update folder paths, credentials, or sync settings
  • Remove Sync: Disconnect folder from automatic synchronization
  • View Sync History: Track synchronization events and file processing results

File Operations

✅ What You Can Do

File Management:

  • View All Files: Browse complete file inventory for your company (manual and synced)
  • Search Files: Find files by filename using real-time search
  • Filter by Type: Show only specific document types
  • Filter by Source: Distinguish between manually uploaded and synchronized files
  • Download Files: Access original uploaded documents
  • Delete Files: Remove files and associated data

Processing Control:

  • Reprocess Files: Trigger re-analysis with updated settings
  • View Metadata: Examine file details and extracted information
  • Monitor Status: Track processing progress and results
  • Edit Metadata: Update file information manually when needed

Synchronization Management:

  • Configure Sync Folders: Set up automatic synchronization with cloud providers
  • Manual Sync Triggers: Force immediate synchronization when needed
  • Sync Status Monitoring: Track synchronization progress and history
  • Provider Management: Configure multiple cloud storage connections

Information Access:

  • View Extracted Data: See structured information in detailed metadata view
  • JSON Export: Access raw extracted data in structured format
  • Link to Knowledge Base: Navigate to business records created from files
  • Cross-Reference: Connect file data with dashboard and chat insights

🔄 Automated Processing Workflow

Upload Triggers and Processing Steps:

  1. Manual Upload: Document uploaded through any interface (Chat, Files module, direct upload)
  2. Automatic Sync: Files added to synchronized cloud folders
  3. Content Extraction: Text and data extracted from file format
  4. Type Classification: AI determines document category automatically
  5. Schema Application: Appropriate data extraction template applied
  6. Information Analysis: Structured data extracted using AI analysis
  7. Knowledge Base Integration: Business record created with structured data
  8. Platform Integration: Data becomes available across all platform modules

Sync-Specific Processing:

  • Parallel Processing: Multiple sync folders processed simultaneously
  • Change Monitoring: Continuous monitoring of configured cloud folders
  • Incremental Updates: Only modified files trigger reprocessing
  • Dependency Management: Dashboard widgets updated when relevant sync data changes
  • Event Triggers: Workflow automation activated based on synchronized file content

📊 Integration with Knowledge Base

Automatic Data Creation: Every successfully processed file (manual or synchronized) creates corresponding Knowledge Base records:

  • Source Tracking: Business records linked to original files and sync sources
  • Type Mapping: Document types become Knowledge Base categories
  • Data Structure: Extracted information stored as structured JSON data
  • Metadata Preservation: File details and sync information maintained alongside extracted data

Knowledge Enhancement:

  • Searchable Content: File content becomes searchable through Knowledge Base queries
  • Filtered Access: Knowledge Base filtering includes file-derived data from all sources
  • Dashboard Integration: File data powers dashboard widgets and analytics
  • Chat Intelligence: AI assistant references file-extracted information from all sources

โŒ What You Cannot Do

Access Restrictions:

  • Cannot view or manage files from other companies
  • Cannot access files uploaded by other users (unless shared)
  • Cannot modify file content after upload or sync
  • Cannot change file processing history or original metadata
  • Cannot access cloud provider credentials stored by other users

Processing Limitations:

  • Cannot force processing of unsupported file formats
  • Cannot modify AI classification results directly (must reprocess)
  • Cannot merge multiple files into single Knowledge Base records
  • Cannot schedule automated processing (processes immediately on upload/sync)

Synchronization Constraints:

  • Cannot sync folders without proper cloud provider permissions
  • Cannot modify sync history or override change detection
  • Cannot force sync of files that exceed system size limits
  • Cannot bypass cloud provider rate limits or API restrictions

Data Constraints:

  • Cannot export bulk file data directly
  • Cannot modify extraction schemas without administrator access
  • Cannot revert Knowledge Base record creation once files are processed
  • Cannot process files that exceed system size limits

Advanced Features

🔧 Metadata Management

File Metadata Views:

  • Friendly View: Human-readable display of file information and extracted data
  • JSON View: Raw structured data for technical analysis
  • Editable Fields: Certain metadata can be updated manually
  • Processing History: Record of analysis and reprocessing events
  • Sync Information: Cloud provider source and synchronization details

Enhanced Metadata Structure:

{
  "type": "invoice",
  "processing_status": "completed",
  "source": "google_drive_sync",
  "drive_folder_id": "1XYZ...",
  "sync_timestamp": "2024-01-15T10:30:00Z",
  "extracted_data": {
    "customer_name": "Acme Corp",
    "amount": 1500.00,
    "date": "2024-01-15",
    "status": "paid"
  },
  "classification_confidence": 0.95
}
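Metadata in this shape can be read with any JSON parser, for example after using the JSON export. The sketch below loads a shortened copy of the example and flags records that may deserve a manual look. The `needs_review` helper and its 0.8 threshold are illustrative assumptions, not platform settings.

```python
import json

# Shortened copy of the metadata example above.
metadata_json = """
{
  "type": "invoice",
  "processing_status": "completed",
  "extracted_data": {"customer_name": "Acme Corp", "amount": 1500.00},
  "classification_confidence": 0.95
}
"""


def needs_review(metadata: dict, threshold: float = 0.8) -> bool:
    """Flag records that are unfinished or classified with low confidence."""
    if metadata.get("processing_status") != "completed":
        return True
    # Missing confidence is treated as 0.0, so it is always flagged.
    return metadata.get("classification_confidence", 0.0) < threshold


metadata = json.loads(metadata_json)
customer = metadata["extracted_data"]["customer_name"]
flagged = needs_review(metadata)
```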

🎯 Document Type Management

Dynamic Type Learning:

  • System learns new document types automatically from all sources
  • Classification improves with examples from manual uploads and sync folders
  • Custom extraction templates configurable per type
  • Business-specific document categories supported across all input methods

Extraction Configuration:

  • Administrators can configure what data to extract per document type
  • Extraction schemas adapt to business needs and apply to all file sources
  • Field mappings customizable for specific document formats
  • Quality validation ensures extraction accuracy regardless of source

📈 Performance Optimization

Smart Processing:

  • Caching: Processed results cached for fast access across all sources
  • Dependency Tracking: Only relevant dashboards updated when files change
  • Batch Processing: Multiple files and sync operations processed efficiently
  • Resource Management: Processing optimized for system performance

Quality Assurance:

  • Confidence Scoring: AI classification includes confidence levels for all sources
  • Validation Checks: Extracted data validated for consistency
  • Error Reporting: Processing failures clearly identified with source information
  • Manual Override: Manual correction capabilities for edge cases

Sync Optimization:

  • Change Detection: Only modified files trigger reprocessing during sync
  • Error Recovery: Failed sync operations can be retried automatically
  • Rate Limiting: Sync respects cloud provider API limits and usage quotas
  • Background Operations: Sync processes run without blocking user interface
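Automatic retries that stay within provider rate limits are commonly built on exponential backoff. The guide does not say which strategy the platform uses, so the following is only a generic sketch; `retry_with_backoff` and its default values are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on any exception with doubling delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the delays instead of actually waiting.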

Integration Benefits

🔗 Cross-Module Intelligence

Dashboard Module:

  • File-extracted data automatically available in Knowledge Base widgets
  • Document analysis powers business intelligence dashboards (from all sources)
  • Real-time updates when files are reprocessed or re-synced
  • Filtered views based on document types and extracted information

Chat Module:

  • AI assistant references file content and extracted data from all sources
  • Natural language queries about document information
  • File upload and analysis directly within chat interface
  • Context-aware responses based on document intelligence from manual and synced files

Knowledge Base Module:

  • Every processed file creates structured business records
  • File data becomes searchable through Knowledge Base interface
  • Advanced filtering and analysis capabilities
  • Direct link between files and extracted business data

Workflow Module:

  • File processing triggers automated workflows
  • Business processes activated based on document content from any source
  • Event-driven automation using extracted data
  • Integration with external systems based on file analysis

🧠 AI Agent Coordination

Intelligence Scaling: Your file processing capabilities depend on available AI agents:

  • Classification Agents: More accurate document type detection
  • Analysis Agents: Enhanced data extraction capabilities
  • Context Agents: Better integration with existing company data
  • Validation Agents: Improved data quality and accuracy

Cloud Provider Benefits

Google Drive Integration:

  • Seamless Workflow: Documents shared to folders automatically become part of your business intelligence
  • Team Collaboration: Multiple users can add documents to sync folders
  • Real-time Processing: New documents processed as soon as they appear in Drive
  • Permission Inheritance: Sync respects Drive sharing and permission settings

Nextcloud Integration:

  • Self-hosted Control: Keep document processing within your infrastructure
  • Custom Configuration: Flexible WebDAV configuration for specific setups
  • Enterprise Security: Integration with enterprise Nextcloud deployments
  • Private Cloud Benefits: Full control over document storage and processing pipeline

Maximizing File Intelligence: Files are the foundation of your platform's intelligence, whether uploaded manually or synchronized automatically. The more documents you process from various sources, the smarter your AI becomes, and the more valuable insights you gain across all modules. Work with your administrator to optimize extraction schemas and configure strategic sync folders for your business documents.