metadata
license: apache-2.0
tags:
  - RyzenAI
  - Image Segmentation
  - Pytorch
  - Vision
datasets:
  - cityscape
language:
  - en
metrics:
  - mIoU

SemanticFPN model trained on Cityscapes

SemanticFPN is a conceptually simple yet effective baseline for panoptic segmentation, trained here on Cityscapes. The method starts with Mask R-CNN with FPN and adds a lightweight semantic segmentation branch for dense pixel prediction. It was introduced in 2019 in the paper Panoptic Feature Pyramid Networks by Kirillov et al.

We developed a modified version that is supported by AMD Ryzen AI.

Model description

SemanticFPN is a single network that unifies the tasks of instance segmentation and semantic segmentation. The network is designed by endowing Mask R-CNN, a popular instance segmentation method, with a semantic segmentation branch using a shared Feature Pyramid Network (FPN) backbone. This simple design not only remains effective for instance segmentation, but also yields a lightweight, top-performing method for semantic segmentation. It is robust and accurate on both tasks and can serve as a strong starting point for future research in panoptic segmentation.

Intended uses & limitations

You can use the raw model for image segmentation. See the model hub to look for all available SemanticFPN models.

How to use

Installation

Follow Ryzen AI Installation to prepare the environment for Ryzen AI. Then run the following command to install the prerequisites for this model.

pip install -r requirements.txt 

Data Preparation (optional: for accuracy evaluation)

  1. Download the Cityscapes dataset (https://www.cityscapes-dataset.com/downloads)
    • ground truth folder: gtFine_trainvaltest.zip [241MB]
    • image folder: leftImg8bit_trainvaltest.zip [11GB]
  2. Organize the dataset directory as follows:
└── data
     └── cityscapes
          ├── leftImg8bit
          |    ├── train
          |    └── val
          └── gtFine
               ├── train
               └── val
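
Before running the evaluation, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the released scripts: the helper name `missing_cityscapes_dirs` and the constant `EXPECTED_SUBDIRS` are hypothetical, and the root `data/cityscapes` is taken from the tree shown above.

```python
from pathlib import Path

# Sub-directories expected under the dataset root, copied from the tree above.
EXPECTED_SUBDIRS = (
    "leftImg8bit/train",
    "leftImg8bit/val",
    "gtFine/train",
    "gtFine/val",
)


def missing_cityscapes_dirs(root):
    """Return the expected sub-directories that do not exist under `root`.

    Hypothetical helper for illustration; an empty list means the layout
    matches the tree shown above.
    """
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]
```

Calling `missing_cityscapes_dirs("data/cityscapes")` from the project root should return an empty list once both archives have been extracted into place.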

Test & Evaluation

    import argparse

    import numpy as np
    import onnxruntime

    # build_img and colorize_mask are helper functions that are not shown in this excerpt.

    parser = argparse.ArgumentParser(description='SemanticFPN model')
    parser.add_argument('--onnx_path', type=str, default='FPN_int_NHWC.onnx')
    parser.add_argument('--save_path', type=str, default='./data/demo_results/senmatic_results.png')
    parser.add_argument('--input_path', type=str, default='data/cityscapes/cityscapes/leftImg8bit/test/bonn/bonn_000000_000019_leftImg8bit.png')
    parser.add_argument('--ipu', action='store_true',
                    help='use ipu')
    parser.add_argument('--provider_config', type=str, default=None,
                    help='provider config path')
    args = parser.parse_args()

    # Select the execution provider: Vitis AI when --ipu is set, otherwise CPU.
    if args.ipu:
        providers = ["VitisAIExecutionProvider"]
        provider_options = [{"config_file": args.provider_config}]
    else:
        providers = ['CPUExecutionProvider']
        provider_options = None

    # Load and preprocess the input image, then run the ONNX model on it.
    onnx_path = args.onnx_path
    input_img = build_img(args)
    session = onnxruntime.InferenceSession(onnx_path, providers=providers, provider_options=provider_options)
    ort_input = {session.get_inputs()[0].name: input_img.cpu().numpy()}
    ort_output = session.run(None, ort_input)[0]
    if isinstance(ort_output, (tuple, list)):
        ort_output = ort_output[0]

    # Take the first batch item, move the class axis last, and pick the
    # highest-scoring class for each pixel.
    output = ort_output[0].transpose(1, 2, 0)
    seg_pred = np.asarray(np.argmax(output, axis=2), dtype=np.uint8)

    # Convert the label map to a color image and save it.
    color_mask = colorize_mask(seg_pred)
    color_mask.save(args.save_path)
  • Run inference for a single image
python infer_onnx.py --onnx_path FPN_int_NHWC.onnx --input_path /Path/To/Your/Image --ipu --provider_config Path/To/vaip_config.json
  • Test accuracy of the quantized model
python test_onnx.py --onnx_path FPN_int_NHWC.onnx --dataset citys --test-folder ./data/cityscapes --crop-size 256 --ipu --provider_config Path/To/vaip_config.json
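
To process a folder of images rather than a single file, the single-image command can be generated once per image. The sketch below only assembles the argument list; it uses the flags that `infer_onnx.py` defines above (`--onnx_path`, `--input_path`, `--save_path`, `--ipu`, `--provider_config`). The helper name `build_infer_command` and the `_pred.png` output naming are assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Optional


def build_infer_command(
    image,
    out_dir,
    onnx_path: str = "FPN_int_NHWC.onnx",
    provider_config: Optional[str] = None,
) -> List[str]:
    """Build the infer_onnx.py command line for one image.

    Hypothetical helper: the output file name "<stem>_pred.png" is an
    assumption, not something the original script prescribes.
    """
    image = Path(image)
    save_path = Path(out_dir) / f"{image.stem}_pred.png"
    cmd = [
        "python", "infer_onnx.py",
        "--onnx_path", str(onnx_path),
        "--input_path", str(image),
        "--save_path", str(save_path),
    ]
    # Add the IPU flags only when a provider config is supplied;
    # otherwise the script falls back to the CPU execution provider.
    if provider_config is not None:
        cmd += ["--ipu", "--provider_config", str(provider_config)]
    return cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` inside a loop over, for example, `Path(folder).glob("*.png")`.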

Performance

| model | input size | FLOPs | mIoU on Cityscapes Validation |
|---|---|---|---|
| SemanticFPN(ResNet18) | 256x512 | 10G | 62.9% |

| model | input size | FLOPs | INT8 mIoU on Cityscapes Validation |
|---|---|---|---|
| SemanticFPN(ResNet18) | 256x512 | 10G | 62.5% |

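
For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels. The following is a minimal pure-Python sketch of that definition, not the code used by `test_onnx.py`. The function name `mean_iou`, the flat label sequences, and the ignore label 255 (the usual Cityscapes convention) are assumptions.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,
) -> float:
    """Mean IoU over flattened label sequences.

    Assumes every non-ignored target label lies in [0, num_classes).
    Classes that appear in neither prediction nor target are skipped.
    """
    inter = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels without a valid label do not count
        target_area[t] += 1
        if 0 <= p < num_classes:
            pred_area[p] += 1
        if p == t:
            inter[t] += 1
    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - inter[c]
        if union > 0:
            ious.append(inter[c] / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With this definition, the two results reported above differ by 62.9 - 62.5 = 0.4 percentage points between the floating-point and INT8 models.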
@inproceedings{kirillov2019panoptic,
  title={Panoptic feature pyramid networks},
  author={Kirillov, Alexander and Girshick, Ross and He, Kaiming and Doll{\'a}r, Piotr},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={6399--6408},
  year={2019}
}