Blog

  • cdc-pubsub

    CockroachDB CDC to Google Pub/Sub Bridge

    CockroachDB as of v22.1 natively supports sending a changefeed to Google Pub/Sub. This repository is now archived, but will be retained for demonstration purposes.

    This application demonstrates an approach to connecting a CockroachDB Enterprise Change Data Capture (CDC) feed into Google’s Pub/Sub service, until such time as CockroachDB natively supports Google Pub/Sub in a future release.

    This uses the experimental HTTP(S) backend to deliver JSON-formatted
    payloads to a topic.

    Getting Started

    • Create a GCP service account and download its JSON credentials file.
    • Grant the service account Pub/Sub Editor to automatically create a
      topic, or Pub/Sub Publisher if you wish to manually create the topic.
    • Move the JSON credentials file into a working directory, e.g. $HOME/cdc-pubsub/cdc-pubsub.json
    • Start the bridge server:
      • docker run --rm -it -v $HOME/cdc-pubsub:/data:ro -p 13013:13013 bobvawter/cdc-pubsub:latest --projectID my-project-id --sharedKey xyzzy
    • Create an enterprise changefeed in CockroachDB:
      • SET CLUSTER SETTING kv.rangefeed.enabled = true; if you haven’t previously enabled rangefeeds for your cluster.
      • CREATE CHANGEFEED FOR TABLE foo INTO 'experimental-http://127.0.0.1:13013/v1/my-topic?sharedKey=xyzzy' WITH updated;
      • Replace my-topic with your preferred topic name.
    • Check the log for progress.

    Flags

          --bindAddr string        the address to bind to (default ":13013")
          --credentials string     a JSON-formatted Google Cloud credentials file (default
                                   "cdc-pubsub.json")
          --dumpOnly               if true, log payloads instead of sending to pub/sub
          --gracePeriod duration   shutdown grace period (default 30s)
      -h, --help                   display this message
          --projectID string       the Google Cloud project ID
          --sharedKey strings      require clients to provide one of these secret values
          --topicPrefix string     a prefix to add to topic names
    

    Pub/Sub Attributes

    Each Pub/Sub message will be labelled with the following attributes.

    • table: The affected SQL table.
    • path: The complete path used to post the message.
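
    As a minimal consumer sketch (assuming the google-cloud-pubsub Python client and a subscription named my-topic-sub that you have already created on the topic), these attributes can be read like this:

      from concurrent.futures import TimeoutError
      from google.cloud import pubsub_v1

      # Assumed names: adjust the project, subscription, and credentials to your setup.
      PROJECT_ID = 'my-project-id'
      SUBSCRIPTION_ID = 'my-topic-sub'

      subscriber = pubsub_v1.SubscriberClient()
      subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)

      def callback(message):
          # The bridge labels each message with the source SQL table and the POST path.
          table = message.attributes.get('table')
          path = message.attributes.get('path')
          print('table={} path={} payload={}'.format(table, path, message.data.decode('utf-8')))
          message.ack()

      streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
      try:
          streaming_pull_future.result(timeout=30)  # listen for 30 seconds, then stop
      except TimeoutError:
          streaming_pull_future.cancel()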

    Building

    docker build . -t cdc-pubsub

    Other endpoints

    If the bridge is to be placed behind a load-balancer (e.g. in a
    Kubernetes environment), there is a /healthz endpoint which always
    returns OK.

    Runtime profiling information is available at /debug/pprof.

    Security implications

    The bridge server can require a shared key, which the CDC feed provides via the
    sharedKey query parameter. This key prevents users from inadvertently
    “crossing the streams”; it is not a proper security mechanism:

    • Any HTTP client with this shared key can effectively post arbitrary
      messages to any Pub/Sub topic that the bridge’s service account has
      access to.
    • Any SQL user that can execute the SHOW JOBS command can view the shared key.
    • Any user that can view the Jobs page in the Admin UI can view the shared key.
    • The shared key will likely appear unobfuscated in CockroachDB logs.

    Seamless rotation of shared keys is possible by passing multiple
    --sharedKey arguments to the bridge server.

    Google Cloud IAM restrictions can be added to the service account to limit
    the names of the Pub/Sub topics that it may access.

    Deployment strategy

    Given the lightweight nature of the bridge server and the above security
    limitations, users should deploy this server as a “sidecar” alongside
    each of their CockroachDB nodes, bound only to a loopback IP address via
    the --bindAddr flag.

    If the bridge is to be deployed as a traditional network service, it
    should be placed behind a TLS load balancer with appropriate firewall
    rules.

    Visit original content creator repository
    https://github.com/bobvawter/cdc-pubsub

  • Decentralized-Voting-System-Using-Ethereum-Blockchain

    Decentralized-Voting-System-Using-Ethereum-Blockchain

    The Decentralized Voting System using Ethereum Blockchain is a secure and transparent solution for conducting elections. Leveraging Ethereum’s blockchain technology, this system ensures tamper-proof voting records, enabling users to cast their votes remotely while maintaining anonymity and preventing fraud.


    Table of Contents


    Features

    • JWT for secure voter authentication and authorization.
    • Ethereum blockchain for tamper-proof and transparent voting records.
    • Removes the need for intermediaries, ensuring a trustless voting process.
    • Admin panel to manage candidates, set voting dates, and monitor results.
    • Intuitive UI for voters to cast votes and view candidate information.

    Screenshots

    Admin Page

    Admin Page

    Voting Page

    Voting Page

    Login Page

    Login Page


    Requirements

    • Node.js (version 18.14.0)
    • Metamask
    • Python (version 3.9)
    • FastAPI
    • MySQL Database (port 3306)

    Installation

    1. Clone the repository:

      git clone https://github.com/akanksha509/Decentralized-Voting-System-Using-Ethereum-Blockchain.git
    2. Download and install Ganache.

    3. Create a workspace named development in Ganache, then add truffle-config.js in the Truffle projects section by clicking ADD PROJECT.

    4. Install Metamask in your browser and import the Ganache accounts into Metamask.

    5. Add a network to Metamask:

    6. Create a MySQL database named voter_db (avoid using XAMPP). Inside this database, create a table voters (a sketch for seeding an initial row follows this list):

      CREATE TABLE voters (
          voter_id VARCHAR(36) PRIMARY KEY NOT NULL,
          role ENUM('admin', 'user') NOT NULL,
          password VARCHAR(255) NOT NULL
      );
    7. Install Truffle globally:

      npm install -g truffle
    8. Install Node.js dependencies (in the project folder):

      npm install
    9. Install Python dependencies:

      pip install fastapi mysql-connector-python pydantic python-dotenv uvicorn uvicorn[standard] PyJWT
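
    The voters table from step 6 needs at least one row before anyone can log in. Below is a hedged seeding sketch: the connection details and the stored password format are assumptions, so check ./Database_API/main.py and .env for what the API actually expects.

      import uuid

      import mysql.connector  # installed above as mysql-connector-python

      # Assumed connection details: replace them with the values from ./Database_API/.env
      conn = mysql.connector.connect(
          host='127.0.0.1', port=3306, user='root', password='your-password', database='voter_db'
      )
      cursor = conn.cursor()

      # Insert one admin voter. Whether the password must be plaintext or hashed depends
      # on Database_API/main.py, so treat this value as a placeholder.
      cursor.execute(
          'INSERT INTO voters (voter_id, role, password) VALUES (%s, %s, %s)',
          (str(uuid.uuid4()), 'admin', 'change-me'),
      )
      conn.commit()
      cursor.close()
      conn.close()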

    Usage

    Note: Update the database credentials in ./Database_API/.env with your MySQL username, password, etc.

    1. Open Ganache and select the development workspace.

    2. Open a terminal in the project directory and enter the Truffle console:

      truffle console
    3. Compile the smart contracts:

      compile

      Then exit the console by typing .exit or pressing Ctrl + C.

    4. Bundle app.js with Browserify:

      browserify ./src/js/app.js -o ./src/dist/app.bundle.js
    5. Start the Node.js server:

      node index.js
    6. Open another terminal, navigate to the Database_API folder:

      cd Database_API
    7. Start the FastAPI server:

      uvicorn main:app --reload --host 127.0.0.1
    8. In a new terminal, migrate the Truffle contract to the local blockchain:

      truffle migrate
    9. Access the Voting app at http://localhost:8080/.


    Code Structure

    blockchain-voting-dapp/
    ├── build/
    │   └── contracts/
    │       ├── Migrations.json
    │       └── Voting.json
    ├── contracts/
    │   ├── Migrations.sol
    │   └── Voting.sol
    ├── Database_API/
    │   └── main.py
    ├── migrations/
    │   └── 1_initial_migration.js
    ├── node_modules/
    ├── public/
    │   └── favicon.ico
    ├── src/
    │   ├── assets/
    │   │   └── eth5.jpg
    │   ├── css/
    │   │   ├── admin.css
    │   │   ├── index.css
    │   │   └── login.css
    │   ├── dist/
    │   │   ├── app.bundle.js
    │   │   └── login.bundle.js
    │   ├── html/
    │   │   ├── admin.html
    │   │   ├── index.html
    │   │   └── login.html
    │   └── js/
    │       ├── app.js
    │       └── login.js
    ├── index.js
    ├── package.json
    ├── package-lock.json
    ├── truffle-config.js
    └── README.md
    

    License

    This project is licensed under the MIT License.


    Star the Project

    ⭐ If you like this project, please give it a star!

    Visit original content creator repository https://github.com/akanksha509/Decentralized-Voting-System-Using-Ethereum-Blockchain
  • springboot-3-micro-service-demo

    Spring Boot Microservices

    Microservices sample project


    This repository contains a demo project showcasing a microservices-based application, designed to provide a hands-on understanding of microservices architecture and implementation. The project consists of an API Gateway, Config Server, Discovery Server, and two microservices: Student and School.

    Table of Contents

    Getting Started

    Follow the instructions below to set up the project on your local machine for development and testing purposes.

    Prerequisites

    Ensure you have the following software installed on your system before proceeding:

    • Java Development Kit (JDK) 17 or later
    • Maven
    • Docker (optional, for containerization)

    Installation

    1. Clone the repository:

    git clone git@github.com:khalil-bouali/springboot-3-micro-service-demo.git

    2. Navigate to the project directory.
    3. Build and package each component with Maven.

    Project Components

    API Gateway

    The API Gateway serves as the single entry point for all client requests, managing and routing them to the appropriate microservices.

    Config Server

    The Config Server centralizes configuration management for all microservices, simplifying application maintenance and consistency across environments.

    Discovery Server

    The Discovery Server provides service registration and discovery, enabling seamless service-to-service communication within the microservices ecosystem.

    Student Microservice

    The Student Microservice is responsible for managing student-related data and operations, such as adding, updating, and retrieving student records.

    School Microservice

    The School Microservice manages school-related data and operations, including adding, updating, and retrieving school records.

    Inter-Service Communication

    Using OpenFeign

    This project demonstrates inter-service communication using OpenFeign, a declarative REST client that simplifies service-to-service communication within the microservices ecosystem.

    Distributed Tracing

    Using Zipkin

    The project showcases the use of Zipkin for distributed tracing, enhancing application observability and enabling the visualization and troubleshooting of latency issues.

    Contributing

    Contributions are welcome! Please read our CONTRIBUTING.md for details on how to contribute to this project.

    License

    This project is licensed under the MIT License.

    Contact

    Khalil Bouali – khalil.bouali95@gmail.com

    LinkedIn – https://www.linkedin.com/in/khalil-bouali

    Project Link: https://github.com/khalil-bouali/springboot-3-micro-service-demo

    Acknowledgements

    Visit original content creator repository https://github.com/khalil-bouali/springboot-3-micro-service-demo
  • react-spinners-components

    react-spinners-components

    Very easy to use loading spinners for React.


    You can check the available loading spinners on the link below:

    Install

    npm install react-spinners-components
    

    or

    yarn add react-spinners-components
    

    Usage

    There is a total of 15 types of loading spinners: Ball, Blocks, Cube, Discuss, Disk, DualBall, Eater, Gear, Infinity, Interwind, Pulse, Ripple, Rolling, Spinner, Wedges. Please capitalize the first letter when inserting the type prop, e.g., Ball —> ‘Ball’.

    Please notice the following:

    • When the component accepts only one color —> prop is called color and accepts a single string, e.g., ‘red’ or ‘#f91a10’;
    • When the component needs more than one color —> prop is called colors and accepts an array of strings with the colors that it needs (check the examples to know how many colors each type needs);
    • The size prop needs a string. You can use any unit, e.g., px and rem, but if the unit is not stated, px will be applied by default. Examples: ‘150px’, ’10rem’, ‘150’;

    If no props are given

    • None of the props are required. If no props are given, the react-spinners-components will return the LoadingSpinnerComponent with the ‘Ball’ type, default color and size.

    • If props color(s) and / or size are not given, default values will be used for the missing props.

    Examples

    Loading spinner type ‘Ball’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Ball' } color={ 'red' } size={ '100px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Blocks’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Blocks' } colors={ [ '#06628d', '#f91a10' ] } size={ '100px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Cube’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Cube' } colors={ [ '#06628d', '#f91a10', 'yellow', 'purple' ] } size={ '100px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Discuss’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Discuss' } color={ '#06628d' } size={ '100px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Disk’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Disk' } colors={ [ '#06628d', 'purple'] } size={ '100px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘DualBall’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'DualBall' } colors={ [ '#06628d', 'purple', '#06628d'] } size={ '200px' } />
      );
    };
    
    export default Example;

    Note: ‘DualBall’ can actually work like a ‘TriBall’ by using 3 different colors, example below:

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'DualBall' } colors={ [ '#06628d', 'purple', 'yellow'] } size={ '200px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Eater’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Eater' } colors={ [ '#06628d', 'purple'] } size={ '150px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Gear’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Gear' } color={ 'purple' } size={ '150px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Infinity’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Infinity' } color={ 'purple' } size={ '150px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Interwind’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Interwind' } colors={ [ '#06628d', 'purple'] } size={ '125px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Pulse’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Pulse' } colors={ [ '#06628d', 'purple', 'blue'] } size={ '150px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Ripple’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Ripple' } colors={ [ '#06628d', 'purple'] } size={ '150px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Rolling’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Rolling' } color={ 'purple' } size={ '150px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Spinner’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Spinner' } color={ 'purple' } size={ '150px' } />
      );
    };
    
    export default Example;

    Loading spinner type ‘Wedges’

    import React from 'react';
    import LoadingSpinnerComponent from 'react-spinners-components';
    
    const Example = () => {
      return(
        <LoadingSpinnerComponent type={ 'Wedges' } colors={ [ '#06628d', 'purple', 'blue', 'yellow'] } size={ '300px' } />
      );
    };
    
    export default Example;

    References

    Author

    @kazimkazam

    Repository

    @Github

    @npm

    License

    MIT © kazimkazam

    Visit original content creator repository https://github.com/kazimkazam/react-spinners-components
  • ecs-task


    ecs-task

    ecs-task is an opinionated, but flexible tool for deploying to Amazon Web Service’s Elastic Container Service.

    It is built on the following premises:

    • ECS Services, load balancers, auto-scaling, etc. are managed elsewhere, e.g. Terraform, Cloudformation, etc.
    • Deploying to ECS is defined as:
      1. Update the task definition with a new image tag.
      2. [Optional] Run any number of one-off Tasks, e.g. Django database migrations.
      3. [Optional] Update Services to use the new Task Definition.
      4. [Optional] Update Cloudwatch Event Targets to use the new Task Definition.
      5. Deregister old Task Definitions.
    • Applications manage their own Task/Container definitions and can deploy themselves to a pre-defined ECS Cluster.
    • The ability to rollback is important and should be as easy as possible.

    Installation

    pip install ecs-task
    

    (Optionally, just copy ecs_task.py to your project and install boto3).

    Usage

    This module is made up of a single class, ecs_task.ECSTask which is designed to be extended in your project. A basic example:

    #!/usr/bin/env python
    from ecs_task import ECSTask
    
    class WebTask(ECSTask):
        task_definition = {
            "family": "web",
            "executionRoleArn": EXECUTION_ROLE_ARN,
            "containerDefinitions": [
                {
                    "name": "web",
                    "image": "my_image:{image_tag}",
                    "portMappings": [{"containerPort": 8080}],
                    "cpu": 1024,
                    "memory": 1024,
                }
            ],
        }
        update_services = [{"service": "web", "cluster": "my_cluster",}]
    
    if __name__ == "__main__":
        WebTask().main()

    You could save this as _ecs/web_dev.py and then execute it with python -m _ecs.web_dev --help

    usage: web_dev.py [-h] {deploy,rollback,debug} ...
    
    ECS Task
    
    positional arguments:
      {deploy,rollback,debug}
        deploy              Register new task definitions using `image_tag`.
                            Update defined ECS Services, Event Targets, and run
                            defined ECS Tasks
        rollback            Deactivate current task definitions and rollback all
                            ECS Services and Event Targets to previous active
                            definition.
        debug               Dump JSON generated for class attributes.
    
    optional arguments:
      -h, --help            show this help message and exit
    

    Class attributes

    A sub-class of ECSTask must include a task_definition to do anything. Any other attributes are optional. The following attributes are designed to be a 1-to-1 mapping to an AWS API endpoint via boto3. The values you provide will be passed as keyword arguments to the associated method with the correct Task Definition inserted. Any attribute that takes a list can make multiple calls to the given API.

    A few additional attributes are available:

    • active_task_count: (int) the number of task definitions to keep active after a deployment. Default is 10.

    • sns_notification_topic_arn: (str) the ARN for an SNS topic which will receive a message whenever an AWS API call is executed. This can be used to trigger notifications or perform additional tasks related to the deployment. The message is in the format:

        {
          "client": client,  # boto3 client (usually "ecs")
          "method": method,  # method called (e.g., "update_service")
          "input": kwargs,   # method input as a dictionary
          "result": result   # results from AWS API
        }
    • notification_method_blacklist_regex (re.Pattern) a pattern of methods to avoid sending notifications for. Default is re.compile(r"^describe_|get_|list_|.*register_task")
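
    To make the attribute mapping concrete, here is a hedged sketch of a fuller subclass. The attribute names run_tasks, update_services, and events__put_targets come from the sections below; the keyword shapes are assumptions modeled on the boto3 ecs.run_task and events.put_targets calls, so check the class source before relying on them.

    #!/usr/bin/env python
    from ecs_task import ECSTask

    class WorkerTask(ECSTask):
        task_definition = {
            "family": "worker",
            "containerDefinitions": [
                {"name": "worker", "image": "my_image:{image_tag}", "cpu": 256, "memory": 512}
            ],
        }
        # One-off tasks to run on each deploy, e.g. Django database migrations.
        run_tasks = [
            {
                "cluster": "my_cluster",
                "overrides": {
                    "containerOverrides": [
                        {"name": "worker", "command": ["python", "manage.py", "migrate"]}
                    ]
                },
            }
        ]
        # Services to point at the new Task Definition.
        update_services = [{"service": "worker", "cluster": "my_cluster"}]
        # Cloudwatch Event Rule targets to update with the new Task Definition (assumed shape).
        events__put_targets = [
            {"Rule": "nightly-job", "Targets": [{"Id": "worker", "Arn": "arn:aws:ecs:..."}]}
        ]

    if __name__ == "__main__":
        WorkerTask().main()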

    Command Interface

    Each class is intended to be “executable” by calling .main(). Multiple class instances can be called in a given file by using:

    if __name__ == "__main__":
        for klass in [WebTask, WorkerTask]:
            klass().main()

    debug

    Just prints the value of each class attribute to the console. Useful if you’re doing some class inheritance and want to verify what you have before running against AWS.

    deploy

    The deploy subcommand accepts an additional argument, image_tag which is used to update any Container Definitions in the task which have the {image_tag} placeholder. It will:

    1. Register a new Task Definition
    2. Run Tasks (as defined in run_tasks)
    3. Update Services (as defined in update_services)
    4. Update Event Targets (as defined in events__put_targets)
    5. Deregister any active Task Definitions older than active_task_count (by default, 10)

    rollback

    1. Deregister the latest active Task Definition
    2. Update Services (as defined in update_services) with the previous active Task Definition
    3. Update Event Targets (as defined in events__put_targets) with the previous active Task Definition
    Visit original content creator repository https://github.com/lincolnloop/ecs-task
  • avro-sql


    Avro-Sql

    This is a library that lets you transform the shape of an Avro record using SQL. It relies on Apache Calcite for the SQL parsing.

    import AvroSql._
    val record: GenericRecord = {...}
    record.scql("SELECT name, address.street.name as streetName")

    As simple as that!

    Let’s say we have the following Avro Schema:

    {
      "type": "record",
      "name": "Pizza",
      "namespace": "com.landoop.sql.avro",
      "fields": [
        {
          "name": "ingredients",
          "type": {
            "type": "array",
            "items": {
              "type": "record",
              "name": "Ingredient",
              "fields": [
                {
                  "name": "name",
                  "type": "string"
                },
                {
                  "name": "sugar",
                  "type": "double"
                },
                {
                  "name": "fat",
                  "type": "double"
                }
              ]
            }
          }
        },
        {
          "name": "vegetarian",
          "type": "boolean"
        },
        {
          "name": "vegan",
          "type": "boolean"
        },
        {
          "name": "calories",
          "type": "int"
        },
        {
          "name": "fieldName",
          "type": "string"
        }
      ]
    }

    using the library one can apply two types of queries:

    • to flatten it
    • to retain the structure while cherry-picking and/or renaming fields

    The difference between the two is marked by the withstructure keyword. If it is missing, you will end up flattening the structure.

    Let’s take a look at flattening first. There are cases where you receive a nested Avro structure and want to flatten it while being able to cherry-pick the fields and rename them. Imagine we have the following Avro schema:

    {
      "type": "record",
      "name": "Person",
      "namespace": "com.landoop.sql.avro",
      "fields": [
        {
          "name": "name",
          "type": "string"
        },
        {
          "name": "address",
          "type": {
            "type": "record",
            "name": "Address",
            "fields": [
              {
                "name": "street",
                "type": {
                  "type": "record",
                  "name": "Street",
                  "fields": [
                    {
                      "name": "name",
                      "type": "string"
                    }
                  ]
                }
              },
              {
                "name": "street2",
                "type": [
                  "null",
                  "Street"
                ]
              },
              {
                "name": "city",
                "type": "string"
              },
              {
                "name": "state",
                "type": "string"
              },
              {
                "name": "zip",
                "type": "string"
              },
              {
                "name": "country",
                "type": "string"
              }
            ]
          }
        }
      ]
    }
    

    Applying this SQL-like syntax

    SELECT 
        name, 
        address.street.*, 
        address.street2.name as streetName2 
    FROM topic
    

    the projected new schema is:

    {
      "type": "record",
      "name": "Person",
      "namespace": "com.landoop.sql.avro",
      "fields": [
        {
          "name": "name",
          "type": "string"
        },
        {
          "name": "name_1",
          "type": "string"
        },
        {
          "name": "streetName2",
          "type": "string"
        }
      ]
    }
    

    There are scenarios where you might want to rename fields and maybe reorder them. By applying this SQL-like syntax to the Pizza schema

    SELECT 
           name, 
           ingredients.name as fieldName, 
           ingredients.sugar as fieldSugar, 
           ingredients.*, 
           calories as cals 
    withstructure
    

    we end up projecting the first structure into this one:

    {
      "type": "record",
      "name": "Pizza",
      "namespace": "com.landoop.sql.avro",
      "fields": [
        {
          "name": "name",
          "type": "string"
        },
        {
          "name": "ingredients",
          "type": {
            "type": "array",
            "items": {
              "type": "record",
              "name": "Ingredient",
              "fields": [
                {
                  "name": "fieldName",
                  "type": "string"
                },
                {
                  "name": "fieldSugar",
                  "type": "double"
                },
                {
                  "name": "fat",
                  "type": "double"
                }
              ]
            }
          }
        },
        {
          "name": "cals",
          "type": "int"
        }
      ]
    }

    Flatten rules

    • you can’t flatten a schema containing array fields
    • when flattening, if a column name has already been used it will get an index appended. For example, if the field name appears twice and you don’t specifically rename the second instance (name as renamedName), the new schema will end up containing name and name_1

    How to use it

    import AvroSql._
    val record: GenericRecord = {...}
    record.scql("SELECT name, address.street.name as streetName")

    As simple as that!

    Query Examples

    You can find more examples in the unit tests; however, here are a few:

    • flattening
    //rename and only pick fields on first level
    SELECT calories as C ,vegan as V ,name as fieldName FROM topic
    
    //Cherry pick fields on different levels in the structure
    SELECT name, address.street.name as streetName FROM topic
    
    //Select and rename fields on nested level
    SELECT name, address.street.*, address.street2.name as streetName2 FROM topic
    
    • retaining the structure
    //you can select itself - obviously no real gain on this
    SELECT * FROM topic withstructure 
    
    //rename a field 
    SELECT *, name as fieldName FROM topic withstructure
    
    //rename a complex field
    SELECT *, ingredients as stuff FROM topic withstructure
    
    //select a single field
    SELECT vegan FROM topic withstructure
    
    //rename and only select nested fields
    SELECT ingredients.name as fieldName, ingredients.sugar as fieldSugar, ingredients.* FROM topic withstructure
    
    
    

    Release Notes

    0.1 (2017-05-03)

    • first release

    Building

    Requires gradle 3.4.1 to build.

    To build

    gradle compile

    To test

    gradle test

    You can also use the gradle wrapper

    ./gradlew build
    

    To view dependency trees

    gradle dependencies # 
    
    Visit original content creator repository https://github.com/lensesio/avro-sql
  • mbit-m08-dc02-nlp

    NLP (NATURAL LANGUAGE PROCESSING) EXERCISE

    Carlos Alfonsel (carlos.alfonsel@mbitschool.com)

    1. Exploratory Data Analysis (EDA)

    • Importing the libraries and the dataset.
    • Study and graphical representation of the 8 classes: class-balance analysis.

    2. Text Cleaning

    The clean_text() function is written to remove numbers and punctuation and to convert every word to lowercase:

      # `nlp` is the spaCy language pipeline loaded during the imports step.
      pattern = re.compile('[{}]'.format(re.escape(string.punctuation)))

      def clean_text(doc):
          doc = re.sub(r'\d+', '', doc)
          tokens = nlp(doc)
          tokens = [tok.lower_ for tok in tokens if not tok.is_punct and not tok.is_space]
          filtered_tokens = [pattern.sub('', token) for token in tokens]
          filtered_text = ' '.join(filtered_tokens)
          return filtered_text
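
    A hedged usage sketch (assuming datos is the pandas DataFrame loaded during the EDA step, with the same Observaciones text column used later in section 4):

      # Clean every document in the text column before vectorizing it.
      datos['Observaciones'] = datos['Observaciones'].apply(clean_text)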


    3. Auxiliary Function Definitions

    The bow_extractor() and tfidf_extractor() functions are defined to build feature vectors from the corpus passed as a parameter:

      def bow_extractor(corpus, ngram_range = (1,1), min_df = 1, max_df = 1.0):
          vectorizer = CountVectorizer(min_df = 1, max_df = 0.95)
          features = vectorizer.fit_transform(corpus)
          return vectorizer, features

      def tfidf_extractor(corpus, ngram_range = (1,1), min_df = 1, max_df = 1.0):
          vectorizer = TfidfVectorizer(min_df = 1, max_df = 0.95)
          features = vectorizer.fit_transform(corpus)
          return vectorizer, features
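
    A hedged usage sketch on a toy corpus (assuming CountVectorizer and TfidfVectorizer have been imported from sklearn.feature_extraction.text during the imports step):

      # Hypothetical two-document corpus, just to show the call shape.
      corpus = ['primer documento de prueba', 'segundo documento de ejemplo']

      bow_vectorizer, bow_features = bow_extractor(corpus)
      tfidf_vectorizer, tfidf_features = tfidf_extractor(corpus)
      print(bow_features.shape, tfidf_features.shape)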


    4. Splitting the Dataset for Training and Validation

      X_train, X_test, y_train, y_test = train_test_split(datos['Observaciones'], datos['Tipología'], test_size = 0.3, random_state = 0)

    5. Classification Algorithms

    In this section we apply the following models to our data: Logistic Regression, Multinomial Naive-Bayes, and Linear SVM, with the following accuracy results:

    Using BoW (Bag-of-Words) features:
    • LGR: 0.61
    • MNB: 0.58
    • SVM: 0.56

    Using TF-IDF features:
    • LGR: 0.55
    • MNB: 0.47
    • SVM: 0.64

    By tuning the Linear SVM model with TF-IDF features we reach an accuracy of 0.70.
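
    As an illustrative sketch of the kind of pipeline behind these numbers (a minimal reconstruction rather than the original notebook; the solver settings are assumptions):

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import accuracy_score

      # Fit the vectorizer on the training split only, then transform both splits.
      tfidf = TfidfVectorizer(min_df = 1, max_df = 0.95)
      X_train_tfidf = tfidf.fit_transform(X_train)
      X_test_tfidf = tfidf.transform(X_test)

      model = LogisticRegression(max_iter = 1000)
      model.fit(X_train_tfidf, y_train)
      print('accuracy:', accuracy_score(y_test, model.predict(X_test_tfidf)))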

    6. CLASSIFIER IMPROVEMENTS

    In this section several alternatives are explored to see whether the classifier results improve:

    6.1. LEMMATIZATION

    The lemmatize_text() function is defined to extract the lemmas (roots) of the words:

      def lemmatize_text(text):
          tokens = nlp(text)
          lemmatized_tokens = [tok.lemma_ for tok in tokens]
          lemmatized_text = ' '.join(lemmatized_tokens)
          return lemmatized_text
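
    A hedged usage sketch (the new column name is an assumption; datos and nlp are the DataFrame and spaCy pipeline used above):

      # Lemmatize the text column so the extractors can work on word roots.
      datos['Observaciones_lemma'] = datos['Observaciones'].apply(lemmatize_text)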
    

    6.2. NEW CLASSIFIERS

    We define three new classifiers: Decision Trees, Random Forest, and K-Nearest Neighbors, with these results once the text has been lemmatized:

    Using BoW (Bag-of-Words) features and lemmatization:
    • CART: 0.58
    • RF:   0.67
    • KNN:  0.39

    Using TF-IDF features and lemmatization:
    • CART: 0.56
    • RF:   0.64
    • KNN:  0.61

    By tuning the Decision Tree Classifier (CART) with TF-IDF features we reach an accuracy of 0.65.

    6.3. LSA (Latent Semantic Analysis) DIMENSIONALITY REDUCTION

    Finally, we try one of the dimensionality reduction techniques and analyze the results. We define the lsa_extractor() function, which builds a Latent Semantic Analysis model over a text corpus using 100 dimensions:

      def lsa_extractor(corpus, n_dim = 100):
          tfidf = TfidfVectorizer(use_idf = True)
          svd = TruncatedSVD(n_dim)
          vectorizer = make_pipeline(tfidf, svd, Normalizer(copy = False))
          features = vectorizer.fit_transform(corpus)
          return vectorizer, features
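
    A hedged usage sketch (assuming TfidfVectorizer, TruncatedSVD, Normalizer, and make_pipeline have been imported from scikit-learn, and X_train / X_test come from the split in section 4):

      # Project the corpus onto 100 latent dimensions.
      lsa_vectorizer, X_train_lsa = lsa_extractor(X_train, n_dim = 100)
      X_test_lsa = lsa_vectorizer.transform(X_test)
      print(X_train_lsa.shape)  # (n_documents, 100)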


    Next, we apply the following models to our lemmatized data after applying a 100-dimension LSA reduction to the text: Logistic Regression, Random Forest, K-Nearest Neighbors, and Linear SVM.

    Using TF-IDF features, lemmatization, and LSA-100 dimensionality reduction:
    • LGR: 0.68
    • RF:  0.55
    • KNN: 0.61
    • SVM: 0.64

    6.4. WORD EMBEDDINGS MODEL

    To close out this section on improvements, a model with averaged Word Embeddings is applied with the following classifiers:

    • LGR:  0.45
    • CART: 0.24
    • RF:   0.30
    • KNN:  0.30
    • SVM:  0.39
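
    A minimal sketch of what averaged word embeddings can mean here (an assumption: averaging the spaCy token vectors of each document; the original notebook may have used a different embedding source):

      import numpy as np

      def document_vector(text):
          # Average the spaCy word vectors of all tokens that have one.
          tokens = [tok for tok in nlp(text) if tok.has_vector]
          if not tokens:
              return np.zeros(nlp.vocab.vectors_length)
          return np.mean([tok.vector for tok in tokens], axis = 0)

      # One fixed-size feature vector per document, ready for any classifier above.
      X_train_emb = np.vstack([document_vector(doc) for doc in X_train])
      X_test_emb = np.vstack([document_vector(doc) for doc in X_test])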

    CONCLUSIONS:

    • LEMMATIZING THE TEXT VARIABLE IMPROVES THE RESULTS.
    • APPLYING LSA (Latent Semantic Analysis) DIMENSIONALITY REDUCTION SIGNIFICANTLY IMPROVES THE RESULTS.
    • MODELS WITH AVERAGED WORD EMBEDDINGS PERFORM WORSE THAN THE SIMPLER MODELS (BoW, TF-IDF) BECAUSE OUR DATASET IS VERY SMALL.
    • BEST ALGORITHM FOUND: LOGISTIC REGRESSION WITH TF-IDF FEATURES, A LEMMATIZED DATASET, AND A 100-DIMENSION LSA REDUCTION.

    Visit original content creator repository
    https://github.com/lesnofla/mbit-m08-dc02-nlp