Sentiment Analysis Implementation in Vanilla Forums

How Sentiment Analysis is Calculated in Vanilla Forums

Overview

Vanilla Forums uses AI-powered sentiment analysis to evaluate the emotional tone of user-generated content automatically. Community managers can use the resulting scores to gauge the overall health of conversations and to spot potentially problematic content before it affects the community.

Technical Architecture

Core Components

The sentiment analysis system is built around several key components:

OpenAI Integration: Uses OpenAI's GPT-4o Mini model for sentiment analysis
Sentiment Score Scale: 0-100 numeric scale with predefined ranges
Keyword Tracking: Identifies and scores specific terms within content
Multi-document Processing: Handles long posts by splitting them into chunks and aggregating the per-chunk results

Sentiment Score Ranges

The system categorizes sentiment into five distinct ranges:

Strongly Negative (0-20): Highly negative content
Negative (21-40): Generally negative sentiment
Balanced (41-60): Neutral or mixed sentiment
Positive (61-80): Generally positive sentiment
Strongly Positive (81-100): Highly positive content
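
The ranges above can be expressed as a small lookup. This is a minimal sketch: the function name sentimentLabel is hypothetical, and only the range boundaries come from this document. Values outside 0-100 are rejected here, since negative values are reserved for the content policy codes described later.

```php
<?php

/**
 * Map a 0-100 sentiment score to the range labels listed above.
 * The function name is hypothetical; only the ranges come from this document.
 */
function sentimentLabel(int $score): string
{
    if ($score < 0 || $score > 100) {
        throw new InvalidArgumentException("Score must be between 0 and 100, got {$score}");
    }
    if ($score <= 20) {
        return "Strongly Negative";
    }
    if ($score <= 40) {
        return "Negative";
    }
    if ($score <= 60) {
        return "Balanced";
    }
    if ($score <= 80) {
        return "Positive";
    }
    return "Strongly Positive";
}

echo sentimentLabel(72), PHP_EOL; // Positive
```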

Post-Level Sentiment Calculation

Processing Flow

User Consent Check: System first verifies the user has opted into sentiment analysis via cookie preferences
Content Preparation: Post content (title + body for discussions, body for comments) is converted to plain text
Document Normalization: Posts exceeding 5,000 characters are split into smaller chunks
AI Analysis: Each chunk is sent to OpenAI's GPT-4o Mini with a specialized sentiment analysis prompt
Result Aggregation: Results from multiple chunks are averaged into a single post-level result (the code example below uses an unweighted mean of the chunk scores)
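
The document normalization step could look roughly like the sketch below. This document does not describe the actual splitting rules, so the helper splitIntoChunks and its greedy word-packing strategy are assumptions made for illustration. It also measures length in bytes with strlen(), which differs from a character count for multibyte text.

```php
<?php

/**
 * Greedily pack whitespace-separated words into chunks of at most $limit bytes.
 * Hypothetical helper: the document does not describe the real splitting rules.
 * strlen() counts bytes; swap in mb_strlen()/mb_str_split() for multibyte text.
 */
function splitIntoChunks(string $text, int $limit = 5000): array
{
    if ($limit < 1) {
        throw new InvalidArgumentException("Limit must be at least 1");
    }
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $chunks = [];
    $current = "";
    foreach ($words as $word) {
        // A single word longer than the limit is hard-split into limit-sized pieces.
        foreach (str_split($word, $limit) as $piece) {
            $candidate = $current === "" ? $piece : $current . " " . $piece;
            if (strlen($candidate) <= $limit) {
                $current = $candidate;
            } else {
                // The current chunk is full: store it and start a new one.
                $chunks[] = $current;
                $current = $piece;
            }
        }
    }
    if ($current !== "") {
        $chunks[] = $current;
    }
    return $chunks;
}

print_r(splitIntoChunks("aaa bbb ccc", 7)); // two chunks: "aaa bbb" and "ccc"
```

Packing on whitespace keeps words intact, which seems preferable to cutting at a fixed offset when each chunk is scored independently.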

Code Example

Here's how multi-document sentiment aggregation works:

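/**
 * Combine per-chunk sentiment results into a single result for the whole post.
 *
 * @param array $documents Per-chunk results, each with "globalSentiment" and "terms" keys.
 * @return array The averaged "globalSentiment" and the merged "terms".
 */
protected function aggregateMultiDocumentsSentiment(array $documents): array
{
    $globalSentiment = 0;
    $terms = [];
    foreach ($documents as $document) {
        $globalSentiment += $document["globalSentiment"];
        foreach ($document["terms"] as $term) {
            if (isset($terms[$term["term"]])) {
                // Term already seen in an earlier chunk: accumulate its totals.
                $terms[$term["term"]]["sentiment"] += $term["sentiment"];
                $terms[$term["term"]]["occurrences"] += $term["occurrences"];
                $terms[$term["term"]]["divisor"] += 1;
            } else {
                // First sighting: "divisor" counts the chunks that mention the term.
                $terms[$term["term"]] = $term;
                $terms[$term["term"]]["divisor"] = 1;
            }
        }
    }

    // Unweighted mean of the chunk scores; $documents must not be empty.
    $globalSentiment = $globalSentiment / count($documents);
    foreach ($terms as $key => $term) {
        // Average each term over the chunks that mention it, rounded up; occurrences stay summed.
        $terms[$key]["sentiment"] = ceil($term["sentiment"] / $term["divisor"]);
        unset($terms[$key]["divisor"]);
    }

    return ["globalSentiment" => $globalSentiment, "terms" => $terms];
}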
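
To make the arithmetic concrete, the sketch below reproduces the two averages by hand for a post split into two chunks. The input values are invented for illustration; the array shape mirrors the keys that aggregateMultiDocumentsSentiment() reads.

```php
<?php

// Hypothetical per-chunk results in the shape aggregateMultiDocumentsSentiment() reads.
$documents = [
    ["globalSentiment" => 70, "terms" => [["term" => "pricing", "sentiment" => 30, "occurrences" => 2]]],
    ["globalSentiment" => 40, "terms" => [["term" => "pricing", "sentiment" => 45, "occurrences" => 1]]],
];

// Global score: plain mean of the chunk scores.
$scores = array_column($documents, "globalSentiment");
$globalSentiment = array_sum($scores) / count($scores);

// Term score: mean over the chunks that mention the term, rounded up.
$pricingSentiment = ceil((30 + 45) / 2);

var_dump($globalSentiment);  // int(55)
var_dump($pricingSentiment); // float(38)
```

For the same input, the function would also sum the occurrences of "pricing" to 3. Note that the term average is taken per chunk, not per occurrence, so a chunk that mentions a term once counts as much as one that mentions it ten times.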

Storage and Tracking

Sentiment data is stored in multiple locations:

Post Records: Global sentiment score stored in the main Discussion/Comment Sentiment field
Attributes Column: Serialized sentiment data in post Attributes
Keyword Sentiment Table: Detailed keyword-level sentiment tracking linked to users

Individual User Sentiment

Aggregation Method

Individual user sentiment is calculated by tracking all posts created by a specific user through the recordUserID field. The system maintains a historical record that includes:

Post-level scores: Each discussion and comment sentiment linked to the user
Keyword associations: Specific terms and their sentiment scores from user content
Temporal patterns: Sentiment trends over time for behavioral analysis
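
A per-user aggregate could be derived from post-level scores along the following lines. This is a sketch under stated assumptions: the row shape (recordUserID and sentiment keys) and the function name averageSentimentByUser are hypothetical, as is the choice to skip missing scores and negative content policy codes.

```php
<?php

/**
 * Average post-level sentiment per user.
 * Assumes each row has "recordUserID" and "sentiment" keys; the row shape is
 * hypothetical. Null and negative values (content policy codes) are skipped.
 */
function averageSentimentByUser(array $rows): array
{
    $sums = [];
    $counts = [];
    foreach ($rows as $row) {
        $score = $row["sentiment"] ?? null;
        if ($score === null || $score < 0) {
            continue;
        }
        $userID = $row["recordUserID"];
        $sums[$userID] = ($sums[$userID] ?? 0) + $score;
        $counts[$userID] = ($counts[$userID] ?? 0) + 1;
    }
    $averages = [];
    foreach ($sums as $userID => $sum) {
        $averages[$userID] = $sum / $counts[$userID];
    }
    return $averages;
}

$rows = [
    ["recordUserID" => 7, "sentiment" => 80],
    ["recordUserID" => 7, "sentiment" => 60],
    ["recordUserID" => 9, "sentiment" => 30],
    ["recordUserID" => 9, "sentiment" => -1], // policy code, ignored
];
print_r(averageSentimentByUser($rows)); // user 7 averages 70, user 9 averages 30
```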

User Privacy and Consent

The system respects user privacy by:

Opt-in Required: Only processes sentiment for users who have accepted sentiment analysis cookies
Transparent Processing: Users are informed about sentiment analysis in privacy policies
Data Control: Users can opt out, stopping future sentiment processing

Content Policy Integration

OpenAI Content Filtering

When content violates OpenAI's content policies, the system assigns a special negative code in place of a 0-100 sentiment score:

Hate Speech (-1): Content flagged for hate speech
Jailbreak Attempts (-2): Attempts to circumvent AI safety measures
Self-harm Content (-3): Content promoting self-harm
Sexual Content (-4): Inappropriate sexual material
Violence (-5): Content promoting violence
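
Code that consumes stored sentiment values needs to tell these codes apart from regular scores. A minimal sketch follows; the mapping is taken from the list above, while the constant POLICY_SENTIMENT_CODES and the function policyCategory are hypothetical names.

```php
<?php

// Code-to-category map taken from the list above; the constant and function names are hypothetical.
const POLICY_SENTIMENT_CODES = [
    -1 => "Hate Speech",
    -2 => "Jailbreak Attempts",
    -3 => "Self-harm Content",
    -4 => "Sexual Content",
    -5 => "Violence",
];

/** Return the policy category for a stored sentiment value, or null for a regular 0-100 score. */
function policyCategory(int $sentiment): ?string
{
    return POLICY_SENTIMENT_CODES[$sentiment] ?? null;
}

var_dump(policyCategory(-3)); // string(17) "Self-harm Content"
var_dump(policyCategory(55)); // NULL
```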

Error Handling

The system gracefully handles various error conditions:

API Failures: Logs errors and continues operation without sentiment data
Content Policy Violations: Assigns appropriate negative sentiment codes
Processing Errors: Falls back to no sentiment rather than incorrect data
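
The "no sentiment rather than incorrect data" fallback can be sketched as a thin wrapper around the analysis call. The callables $analyze and $log are hypothetical stand-ins for the real API client and logger, which this document does not show.

```php
<?php

/**
 * Run an analyzer and fall back to null (no sentiment) on any failure.
 * $analyze and $log are hypothetical callables standing in for the real API client and logger.
 */
function analyzeOrNull(callable $analyze, string $text, callable $log): ?int
{
    try {
        return $analyze($text);
    } catch (Throwable $e) {
        // Record the failure, then continue without sentiment data.
        $log("Sentiment analysis failed: " . $e->getMessage());
        return null;
    }
}

$messages = [];
$log = function (string $message) use (&$messages): void {
    $messages[] = $message;
};
$failing = function (string $text): int {
    throw new RuntimeException("API timeout");
};

var_dump(analyzeOrNull($failing, "hello", $log)); // NULL
```

Returning null keeps a failed analysis distinguishable from a genuine score of 0, which falls in the Strongly Negative range.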

Integration Points

Automation Rules

Sentiment scores can trigger automated community management actions:

Escalation Creation: Automatically escalate highly negative posts
Moderation Queues: Route content based on sentiment thresholds
User Notifications: Alert moderators to sentiment pattern changes
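
Threshold-based routing might be sketched as follows. The thresholds, the action names, and the decision to escalate content policy codes are illustrative assumptions, not product defaults.

```php
<?php

/**
 * Pick an action for a post from its sentiment score.
 * The thresholds and action names are illustrative assumptions, not product defaults.
 */
function routeBySentiment(?int $sentiment, int $escalateAtOrBelow = 20, int $reviewAtOrBelow = 40): string
{
    if ($sentiment === null) {
        return "none"; // no sentiment recorded, e.g. after an API failure
    }
    if ($sentiment < 0) {
        return "escalate"; // negative values are content policy codes
    }
    if ($sentiment <= $escalateAtOrBelow) {
        return "escalate";
    }
    if ($sentiment <= $reviewAtOrBelow) {
        return "moderation_queue";
    }
    return "none";
}

echo routeBySentiment(15), PHP_EOL; // escalate
echo routeBySentiment(35), PHP_EOL; // moderation_queue
echo routeBySentiment(75), PHP_EOL; // none
```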

Analytics and Reporting

Sentiment data feeds into various analytics systems:

Community Health Dashboards: Overall sentiment trending
User Behavior Analysis: Individual user sentiment patterns
Content Performance: Correlation between sentiment and engagement

API Integration

Accessing Sentiment Data

Sentiment scores are accessible through Vanilla's API endpoints:

Discussion API: /api/v2/discussions/{id} includes sentiment field
Comment API: /api/v2/comments/{id} includes sentiment field
Keyword Sentiment API: Access detailed keyword-level sentiment data
User Sentiment Aggregation: Query historical user sentiment patterns
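
Reading the sentiment field from a discussion response could look like the sketch below. It assumes the field appears as a top-level "sentiment" key in the JSON body; the helper names and the sample body are hypothetical, and the HTTP request itself is left out.

```php
<?php

/** Build the discussion endpoint path named above for a given ID. */
function discussionEndpoint(int $discussionID): string
{
    return "/api/v2/discussions/" . $discussionID;
}

/**
 * Read the sentiment field from a JSON response body.
 * Assumes a top-level "sentiment" key; returns null when it is absent.
 */
function extractSentiment(string $responseBody): ?int
{
    $data = json_decode($responseBody, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data) || !isset($data["sentiment"])) {
        return null;
    }
    return (int) $data["sentiment"];
}

// Hypothetical response body, for illustration only.
$body = '{"discussionID": 42, "sentiment": 67}';
echo discussionEndpoint(42), PHP_EOL; // /api/v2/discussions/42
var_dump(extractSentiment($body));    // int(67)
```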

Webhook Integration

Sentiment events can trigger webhooks for external system integration:

Post Sentiment Events: Fired when new content is analyzed
Threshold Alerts: Triggered when sentiment crosses configured thresholds
User Pattern Changes: Notifications for significant user sentiment shifts

Best Practices for Implementation

Configuration Recommendations

Threshold Setting: Establish clear sentiment thresholds for different automation actions
Keyword Tracking: Configure relevant keywords for your community's domain
User Communication: Clearly explain sentiment analysis in privacy policies
Moderation Training: Train moderators on interpreting sentiment data

Performance Considerations

Batch Processing: Large content is automatically chunked for optimal API usage
Rate Limiting: Built-in protections prevent API quota exhaustion
Caching: Sentiment scores are cached to avoid reprocessing unchanged content
Async Processing: Sentiment analysis runs asynchronously to avoid blocking user interactions
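
The caching behavior can be illustrated by keying scores on a hash of the post text, so that unchanged content is never sent for analysis twice. This is a hypothetical in-memory sketch; the actual cache backend and key scheme are not described in this document.

```php
<?php

/**
 * In-memory cache keyed by a hash of the post text, so unchanged content is not re-analyzed.
 * Hypothetical sketch: the real cache backend and key scheme are not described in this document.
 */
final class SentimentCache
{
    /** @var array<string, int> */
    private array $scores = [];

    public function getOrCompute(string $text, callable $analyze): int
    {
        $key = hash("sha256", $text);
        if (!array_key_exists($key, $this->scores)) {
            // Cache miss: analyze once and remember the result.
            $this->scores[$key] = $analyze($text);
        }
        return $this->scores[$key];
    }
}

$calls = 0;
$analyze = function (string $text) use (&$calls): int {
    $calls++;
    return 50; // stand-in for a real analysis call
};

$cache = new SentimentCache();
$cache->getOrCompute("same text", $analyze);
$cache->getOrCompute("same text", $analyze);
echo $calls, PHP_EOL; // 1
```

Editing a post changes its hash, so the edited text is analyzed again while the old entry is simply never read.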

Development Guidelines

Event Handlers: Implement custom event handlers for sentiment-based automation
Database Schema: Understand the sentiment data storage structure for custom queries
Plugin Architecture: Extend sentiment analysis through Vanilla's plugin system
Testing: Use sentiment model test suites for validation during development

This technical documentation covers the implementation details of sentiment analysis in Vanilla Forums. For API reference documentation, consult the OpenAPI specifications. For implementation support, contact the development team.