Files Module - User Guide
Overview
The Files module provides comprehensive file management capabilities and serves as the entry point for document intelligence in the platform. When files are uploaded, they automatically undergo AI-powered analysis to extract structured information that feeds into the platform's Knowledge Base and powers other intelligent features.
The module includes both manual file management and automated folder synchronization capabilities, supporting multiple cloud providers for seamless document workflow integration.
Note: This module requires the files module to be enabled for your company. Enhanced analysis features require additional AI agents for classification and data extraction.
The File Intelligence Pipeline
From Upload to Intelligence
When you upload a file to the system (either manually or through automated synchronization), it goes through a multi-step automated analysis process:
File Upload → Content Extraction → AI Classification → Data Analysis → Knowledge Base Integration → Business Intelligence
Step 1: Content Extraction
- System extracts text content from uploaded documents
- Supports PDF, Word, Excel, and other common formats
- Content becomes available for AI processing
Step 2: AI Classification
- FileClassificator Agent analyzes document content
- Automatically determines document type (invoice, contract, offer, report, etc.)
- Learns from your company's document patterns over time
- Creates or updates document type categories
Step 3: Intelligent Analysis
- FileAnalyzer Agent extracts structured business data
- Uses predefined schemas based on document type
- Applies AI analysis to identify key information fields
- Generates structured data ready for business use
Step 4: Knowledge Base Integration
- Extracted information automatically creates structured business records
- Data becomes searchable and filterable in your Knowledge Base
- Information feeds into dashboards, reports, and AI insights
- Links maintained between original file and extracted data
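The four steps above can be sketched as a small function chain. This is only an illustration under stated assumptions, not the platform's implementation: `FileRecord`, `run_pipeline`, and the three callables passed in are hypothetical names, and the status strings are borrowed from the File Status Indicators described later in this guide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class FileRecord:
    """Hypothetical in-memory view of one file moving through the pipeline."""
    filename: str
    text: Optional[str] = None
    doc_type: Optional[str] = None
    extracted_data: Dict[str, object] = field(default_factory=dict)
    status: str = "Pending"  # upload complete, processing in progress


def run_pipeline(
    record: FileRecord,
    extract_text: Callable[[str], str],
    classify: Callable[[str], str],
    analyze: Callable[[str, str], Dict[str, object]],
) -> FileRecord:
    """Run extraction, classification, and analysis in order.

    The callables stand in for the content extractor, the FileClassificator
    Agent, and the FileAnalyzer Agent; their signatures are assumptions.
    """
    try:
        record.text = extract_text(record.filename)                    # Step 1
        record.doc_type = classify(record.text)                        # Step 2
        record.status = "Classified"
        record.extracted_data = analyze(record.doc_type, record.text)  # Step 3
        # Step 4 (creating the Knowledge Base record) would happen here.
        record.status = "Processed"
    except Exception:
        # Any failing step leaves the file flagged for manual intervention.
        record.status = "Error"
    return record
```

Passing the stages in as callables keeps the sketch independent of any real agent API; a file whose extraction step raises simply ends up with the status "Error".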
AI Agents in Action
FileClassificator Agent:
- Determines document types automatically
- Learns your company's document categories
- Ensures consistent classification across uploads
- Updates type definitions based on new document patterns
FileAnalyzer Agent:
- Extracts business-relevant information from classified documents
- Uses customizable extraction templates per document type
- Processes content intelligently to identify key data points
- Creates structured records ready for business analysis
Manual File Management
File Manager Layout
When you access the Files module at /files/manager, you'll see:
- File List: Comprehensive table of all uploaded files
- Search & Filter: Find files by name or document type
- File Details: Metadata and processing status for each file
- Actions: Download, reprocess, or delete files
File Information Display
Table Columns:
- ID: Unique file identifier
- Filename: Original file name with visual formatting
- Type: AI-determined document category
- Status: Processing status and extracted information availability
- Actions: Available operations for each file
File Status Indicators:
- Processed: File analyzed and Knowledge Base record created
- Classified: Type determined but analysis pending
- Pending: Upload complete, processing in progress
- Error: Processing failed, manual intervention needed
Automated Folder Synchronization
Drive Integration Overview
The platform includes powerful automated synchronization capabilities that connect your cloud storage folders directly to the intelligent file processing pipeline. This means documents added to configured folders are automatically imported, analyzed, and integrated into your Knowledge Base without manual intervention.
Access: Drive synchronization is available at /drive/manager and requires the files module permission.
Supported Cloud Providers
Google Drive:
- Full folder synchronization from shared or personal folders
- Automatic detection of new, modified, and deleted files
- Real-time content extraction and analysis
- Support for all Google Drive-compatible file formats
Nextcloud:
- WebDAV-based synchronization with Nextcloud servers
- Secure authentication with encrypted credentials
- Custom folder path configuration
- Full file lifecycle management
Setting Up Folder Synchronization
Configuration Process:
- Access Drive Manager: Navigate to the Drive module interface
- Add New Sync Folder: Click "Add" to configure a new synchronized folder
- Choose Provider: Select Google Drive or Nextcloud
- Configure Access: Provide folder ID/link and authentication details
- Start Sync: Initiate first synchronization and set up automatic monitoring
Google Drive Setup:
- Folder Identification: Paste folder sharing link or extract folder ID
- Access Permissions: Ensure folder is accessible with appropriate permissions
- Automatic Processing: System extracts folder ID from sharing links automatically
- Visual Feedback: Interface shows original link for verification
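To illustrate what extracting a folder ID from a sharing link can look like, here is a minimal sketch. It assumes the common Google Drive link shapes (`.../folders/<ID>` and `...?id=<ID>`); the function name `extract_folder_id` is hypothetical, and the platform's own parser may accept other forms.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Characters commonly seen in Drive IDs; the exact alphabet is an assumption.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
_FOLDER_PATH = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def extract_folder_id(link_or_id: str) -> Optional[str]:
    """Return the folder ID from a sharing link, or the input if it is a bare ID."""
    value = link_or_id.strip()
    if not value:
        return None
    parsed = urlparse(value)
    if not parsed.scheme:
        # No URL scheme: accept the value only if it looks like a bare ID.
        return value if _ID_PATTERN.fullmatch(value) else None
    match = _FOLDER_PATH.search(parsed.path)
    if match:
        return match.group(1)
    # Fall back to links that carry the ID as a query parameter.
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None
```

A value that is neither a recognizable link nor a plausible ID yields `None`, so the caller can ask the user to check the link instead of storing a wrong folder.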
Nextcloud Setup:
- Server Configuration: Provide Nextcloud server URL
- Authentication: Username and password for WebDAV access
- WebDAV Path: Custom WebDAV root path (typically /remote.php/webdav/)
- Folder Path: Specific folder path within Nextcloud instance
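The Nextcloud settings above combine into a single WebDAV address. The sketch below shows one way to join them; `build_webdav_url` is a hypothetical helper and the host name in the usage note is a placeholder, not a real endpoint.

```python
from urllib.parse import quote


def build_webdav_url(server_url: str, webdav_path: str, folder_path: str) -> str:
    """Join server URL, WebDAV root, and folder path into one folder URL."""
    base = server_url.rstrip("/")
    # Drop empty pieces so a blank folder path points at the WebDAV root.
    parts = [p.strip("/") for p in (webdav_path, folder_path) if p.strip("/")]
    # Percent-encode spaces and other special characters, keeping "/" separators.
    encoded = quote("/".join(parts), safe="/")
    return f"{base}/{encoded}/" if encoded else f"{base}/"
```

For example, a server URL of `https://cloud.example.com/`, the typical root `/remote.php/webdav/`, and the folder `Documents/Invoices 2024` produce `https://cloud.example.com/remote.php/webdav/Documents/Invoices%202024/`.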
Synchronization Process
Sync Cycle Operations:
- Change Detection: System monitors folders for new, modified, or deleted files
- File Retrieval: New or updated files downloaded to platform storage
- Content Processing: Files undergo same analysis pipeline as manual uploads
- Knowledge Integration: Extracted data automatically creates business records
- Status Tracking: Sync history and file processing status maintained
Smart Synchronization Features:
- Incremental Updates: Only changed files are processed during sync cycles
- Deletion Handling: Files removed from cloud folders are cleaned up locally
- Modification Detection: Changed files are re-analyzed and Knowledge Base updated
- Error Recovery: Failed synchronizations are logged and can be retried
- Background Processing: Sync operations run asynchronously without blocking interface
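Incremental updates, modification detection, and deletion handling all come down to comparing two snapshots of a folder. The following sketch assumes each snapshot maps a file path to a fingerprint such as a checksum or modification timestamp; `SyncChanges` and `detect_changes` are illustrative names, and the platform may track changes differently.

```python
from typing import Dict, List, NamedTuple


class SyncChanges(NamedTuple):
    new: List[str]       # present now, absent in the previous snapshot
    modified: List[str]  # present in both, fingerprint differs
    deleted: List[str]   # present before, absent now


def detect_changes(previous: Dict[str, str], current: Dict[str, str]) -> SyncChanges:
    """Compare two {path: fingerprint} snapshots of a synchronized folder."""
    new = sorted(p for p in current if p not in previous)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    deleted = sorted(p for p in previous if p not in current)
    return SyncChanges(new, modified, deleted)
```

Only the paths in `new` and `modified` would need to be downloaded and re-analyzed, while the paths in `deleted` would be cleaned up locally; unchanged files are skipped entirely.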
Drive Management Interface
Synchronized Folders Dashboard:
- Folder List: All configured sync folders with status information
- Last Sync Status: Timestamp and result of most recent synchronization
- Sync Controls: Manual sync triggers and configuration management
- Provider Information: Cloud provider type and connection details
Folder Status Indicators:
- Available: Successfully configured and ready for synchronization
- Syncing: Active synchronization in progress
- Error: Sync failure requiring attention
- Never Synced: Newly configured folder pending first sync
Management Operations:
- Manual Sync: Trigger immediate synchronization for any folder
- Edit Configuration: Update folder paths, credentials, or sync settings
- Remove Sync: Disconnect folder from automatic synchronization
- View Sync History: Track synchronization events and file processing results
File Operations
What You Can Do
File Management:
- View All Files: Browse complete file inventory for your company (manual and synced)
- Search Files: Find files by filename using real-time search
- Filter by Type: Show only specific document types
- Filter by Source: Distinguish between manually uploaded and synchronized files
- Download Files: Access original uploaded documents
- Delete Files: Remove files and associated data
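As a rough model of how search and the two filters combine, consider the sketch below. The `FileEntry` fields and the `"manual"` / `"sync"` source labels are assumptions made for the example, not the module's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FileEntry:
    id: int
    filename: str
    doc_type: str  # AI-determined document category
    source: str    # hypothetical labels: "manual" or "sync"


def filter_files(
    files: Iterable[FileEntry],
    query: str = "",
    doc_type: Optional[str] = None,
    source: Optional[str] = None,
) -> List[FileEntry]:
    """Case-insensitive filename search combined with optional filters."""
    needle = query.strip().lower()
    return [
        f
        for f in files
        if needle in f.filename.lower()
        and (doc_type is None or f.doc_type == doc_type)
        and (source is None or f.source == source)
    ]
```

Leaving a filter at `None` disables it, so an empty query with no filters returns the complete file inventory.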
Processing Control:
- Reprocess Files: Trigger re-analysis with updated settings
- View Metadata: Examine file details and extracted information
- Monitor Status: Track processing progress and results
- Edit Metadata: Update file information manually when needed
Synchronization Management:
- Configure Sync Folders: Set up automatic synchronization with cloud providers
- Manual Sync Triggers: Force immediate synchronization when needed
- Sync Status Monitoring: Track synchronization progress and history
- Provider Management: Configure multiple cloud storage connections
Information Access:
- View Extracted Data: See structured information in detailed metadata view
- JSON Export: Access raw extracted data in structured format
- Link to Knowledge Base: Navigate to business records created from files
- Cross-Reference: Connect file data with dashboard and chat insights
Automated Processing Workflow
Upload Triggers:
- Manual Upload: Document uploaded through any interface (Chat, Files module, direct upload)
- Automatic Sync: Files added to synchronized cloud folders
Processing Steps:
- Content Extraction: Text and data extracted from file format
- Type Classification: AI determines document category automatically
- Schema Application: Appropriate data extraction template applied
- Information Analysis: Structured data extracted using AI analysis
- Knowledge Base Integration: Business record created with structured data
- Platform Integration: Data becomes available across all platform modules
Sync-Specific Processing:
- Parallel Processing: Multiple sync folders processed simultaneously
- Change Monitoring: Continuous monitoring of configured cloud folders
- Incremental Updates: Only modified files trigger reprocessing
- Dependency Management: Dashboard widgets updated when relevant sync data changes
- Event Triggers: Workflow automation activated based on synchronized file content
Integration with Knowledge Base
Automatic Data Creation: Every successfully processed file (manual or synchronized) creates corresponding Knowledge Base records:
- Source Tracking: Business records linked to original files and sync sources
- Type Mapping: Document types become Knowledge Base categories
- Data Structure: Extracted information stored as structured JSON data
- Metadata Preservation: File details and sync information maintained alongside extracted data
Knowledge Enhancement:
- Searchable Content: File content becomes searchable through Knowledge Base queries
- Filtered Access: Knowledge Base filtering includes file-derived data from all sources
- Dashboard Integration: File data powers dashboard widgets and analytics
- Chat Intelligence: AI assistant references file-extracted information from all sources
What You Cannot Do
Access Restrictions:
- Cannot view or manage files from other companies
- Cannot access files uploaded by other users (unless shared)
- Cannot modify file content after upload or sync
- Cannot change file processing history or original metadata
- Cannot access cloud provider credentials stored by other users
Processing Limitations:
- Cannot force processing of unsupported file formats
- Cannot modify AI classification results directly (must reprocess)
- Cannot merge multiple files into single Knowledge Base records
- Cannot schedule automated processing (processes immediately on upload/sync)
Synchronization Constraints:
- Cannot sync folders without proper cloud provider permissions
- Cannot modify sync history or override change detection
- Cannot force sync of files that exceed system size limits
- Cannot bypass cloud provider rate limits or API restrictions
Data Constraints:
- Cannot export bulk file data directly
- Cannot modify extraction schemas without administrator access
- Cannot revert Knowledge Base record creation once files are processed
- Cannot process files that exceed system size limits
Advanced Features
Metadata Management
File Metadata Views:
- Friendly View: Human-readable display of file information and extracted data
- JSON View: Raw structured data for technical analysis
- Editable Fields: Certain metadata can be updated manually
- Processing History: Record of analysis and reprocessing events
- Sync Information: Cloud provider source and synchronization details
Enhanced Metadata Structure:
{
  "type": "invoice",
  "processing_status": "completed",
  "source": "google_drive_sync",
  "drive_folder_id": "1XYZ...",
  "sync_timestamp": "2024-01-15T10:30:00Z",
  "extracted_data": {
    "customer_name": "Acme Corp",
    "amount": 1500.00,
    "date": "2024-01-15",
    "status": "paid"
  },
  "classification_confidence": 0.95
}
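If you work with the JSON view programmatically, a structure like the one above can be read with any JSON parser. The sketch below parses the sample and flags records that may deserve a manual look; the `needs_review` helper and its 0.8 threshold are assumptions for illustration, not platform defaults.

```python
import json

# The sample metadata from this guide, kept verbatim (including the elided ID).
SAMPLE_METADATA = """
{
  "type": "invoice",
  "processing_status": "completed",
  "source": "google_drive_sync",
  "drive_folder_id": "1XYZ...",
  "sync_timestamp": "2024-01-15T10:30:00Z",
  "extracted_data": {
    "customer_name": "Acme Corp",
    "amount": 1500.00,
    "date": "2024-01-15",
    "status": "paid"
  },
  "classification_confidence": 0.95
}
"""


def needs_review(metadata: dict, threshold: float = 0.8) -> bool:
    """Flag files that did not complete or were classified with low confidence.

    The threshold is a hypothetical value chosen for this example.
    """
    confidence = metadata.get("classification_confidence", 0.0)
    return metadata.get("processing_status") != "completed" or confidence < threshold


metadata = json.loads(SAMPLE_METADATA)
customer = metadata["extracted_data"]["customer_name"]
```

With the sample above, `needs_review(metadata)` is false because processing completed and the confidence of 0.95 is above the assumed threshold.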
Document Type Management
Dynamic Type Learning:
- System learns new document types automatically from all sources
- Classification improves with examples from manual uploads and sync folders
- Custom extraction templates configurable per type
- Business-specific document categories supported across all input methods
Extraction Configuration:
- Administrators can configure what data to extract per document type
- Extraction schemas adapt to business needs and apply to all file sources
- Field mappings customizable for specific document formats
- Quality validation ensures extraction accuracy regardless of source
Performance Optimization
Smart Processing:
- Caching: Processed results cached for fast access across all sources
- Dependency Tracking: Only relevant dashboards updated when files change
- Batch Processing: Multiple files and sync operations processed efficiently
- Resource Management: Processing optimized for system performance
Quality Assurance:
- Confidence Scoring: AI classification includes confidence levels for all sources
- Validation Checks: Extracted data validated for consistency
- Error Reporting: Processing failures clearly identified with source information
- Manual Override: Manual correction capabilities for edge cases
Sync Optimization:
- Change Detection: Only modified files trigger reprocessing during sync
- Error Recovery: Failed sync operations can be retried automatically
- Rate Limiting: Respect cloud provider API limits and usage quotas
- Background Operations: Sync processes run without blocking user interface
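Automatic retries that respect provider rate limits are commonly built on exponential backoff. The sketch below shows that general technique, not the platform's actual retry policy; `RateLimitError`, `sync_with_retry`, and the default attempt count and delay are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when a cloud provider rejects a request as too frequent."""


def sync_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, waiting twice as long after each rate-limit failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # give up; the failure can be logged and retried later
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the waiting behaviour can be checked without real delays. Errors other than the rate-limit error are not retried and propagate immediately.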
Integration Benefits
Cross-Module Intelligence
Dashboard Module:
- File-extracted data automatically available in Knowledge Base widgets
- Document analysis powers business intelligence dashboards (from all sources)
- Real-time updates when files are reprocessed or re-synced
- Filtered views based on document types and extracted information
Chat Module:
- AI assistant references file content and extracted data from all sources
- Natural language queries about document information
- File upload and analysis directly within chat interface
- Context-aware responses based on document intelligence from manual and synced files
Knowledge Base Module:
- Every processed file creates structured business records
- File data becomes searchable through Knowledge Base interface
- Advanced filtering and analysis capabilities
- Direct link between files and extracted business data
Workflow Module:
- File processing triggers automated workflows
- Business processes activated based on document content from any source
- Event-driven automation using extracted data
- Integration with external systems based on file analysis
AI Agent Coordination
Intelligence Scaling: Your file processing capabilities depend on available AI agents:
- Classification Agents: More accurate document type detection
- Analysis Agents: Enhanced data extraction capabilities
- Context Agents: Better integration with existing company data
- Validation Agents: Improved data quality and accuracy
Cloud Provider Benefits
Google Drive Integration:
- Seamless Workflow: Documents shared to folders automatically become part of your business intelligence
- Team Collaboration: Multiple users can add documents to sync folders
- Real-time Processing: New documents processed as soon as they appear in Drive
- Permission Inheritance: Sync respects Drive sharing and permission settings
Nextcloud Integration:
- Self-hosted Control: Keep document processing within your infrastructure
- Custom Configuration: Flexible WebDAV configuration for specific setups
- Enterprise Security: Integration with enterprise Nextcloud deployments
- Private Cloud Benefits: Full control over document storage and processing pipeline
Maximizing File Intelligence: Files are the foundation of your platform's intelligence, whether uploaded manually or synchronized automatically. The more documents you process from various sources, the more examples the classification and analysis agents have to learn from, and the more data is available for insights across all modules. Work with your administrator to optimize extraction schemas and configure strategic sync folders for your business documents.