keras
/

sam_large_sa1b

KerasHub

Divyasreepat committed 8 days ago

Commit

815453f

•

1 Parent(s): 6eb610d

Update README.md with new model card content

README.md +173 -9

@@ -1,14 +1,178 @@
 ---
 library_name: keras-hub
 ---
-This is a [`SAM` model](https://keras.io/api/keras_hub/models/sam) uploaded using the KerasHub library and can be used with JAX, TensorFlow, and PyTorch backends.
-This model is related to a `ImageSegmenter` task.
-Model config:
-* **name:** sam_backbone
-* **trainable:** True
-* **image_encoder:** {'module': 'keras_hub.src.models.vit_det.vit_det_backbone', 'class_name': 'ViTDetBackbone', 'config': {'name': 'vi_t_det_backbone', 'trainable': True, 'image_shape': [1024, 1024, 3], 'patch_size': 16, 'hidden_size': 1024, 'num_layers': 24, 'intermediate_dim': 4096, 'num_heads': 16, 'num_output_channels': 256, 'use_bias': True, 'use_abs_pos': True, 'use_rel_pos': True, 'window_size': 14, 'global_attention_layer_indices': [5, 11, 17, 23], 'layer_norm_epsilon': 1e-06}, 'registered_name': 'keras_hub>ViTDetBackbone'}
-* **prompt_encoder:** {'module': 'keras_hub.src.models.sam.sam_prompt_encoder', 'class_name': 'SAMPromptEncoder', 'config': {'name': 'sam_prompt_encoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'hidden_size': 256, 'image_embedding_size': [64, 64], 'input_image_size': [1024, 1024], 'mask_in_channels': 16, 'activation': 'gelu'}, 'registered_name': 'keras_hub>SAMPromptEncoder'}
-* **mask_decoder:** {'module': 'keras_hub.src.models.sam.sam_mask_decoder', 'class_name': 'SAMMaskDecoder', 'config': {'name': 'sam_mask_decoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'hidden_size': 256, 'num_layers': 2, 'intermediate_dim': 2048, 'num_heads': 8, 'embedding_dim': 256, 'num_multimask_outputs': 3, 'iou_head_depth': 3, 'iou_head_hidden_dim': 256, 'activation': 'gelu'}, 'registered_name': 'keras_hub>SAMMaskDecoder'}
-This model card has been generated automatically and should be completed by the model author. See [Model Cards documentation](https://huggingface.co/docs/hub/model-cards) for more information.

 ---
 library_name: keras-hub
 ---
+## Model Overview
+The Segment Anything Model (SAM) produces high-quality object masks from input prompts such as points or boxes, and it can also generate masks for all objects in an image. It was trained on a dataset of 11 million images and 1.1 billion masks, and it shows strong zero-shot performance on a variety of segmentation tasks. The model is supported in both KerasCV and KerasHub. KerasCV is no longer actively developed, so prefer KerasHub.
+## Links
+* Segment Anything Quickstart Notebook: coming soon
+* [Segment Anything API Documentation](https://keras.io/api/keras_hub/models/sam/)
+* [Segment Anything Model Card](https://github.com/facebookresearch/segment-anything)
+* [Segment Anything paper](https://arxiv.org/abs/2304.02643)
+## Installation
+Keras and KerasHub can be installed with:
+```
+pip install -U -q keras-hub
+pip install -U -q "keras>=3"
+```
+JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.
+## Presets
+The following model checkpoints are provided by the Keras team. Weights have been ported from https://dl.fbaipublicfiles.com/segment_anything/. Full code examples for each are available below.
+| Preset name    | Parameters | Description                                      |
+|----------------|------------|--------------------------------------------------|
+| sam_base_sa1b  | 93.74M     | The base SAM model trained on the SA1B dataset.  |
+| sam_large_sa1b | 312.34M    | The large SAM model trained on the SA1B dataset. |
+| sam_huge_sa1b  | 641.09M    | The huge SAM model trained on the SA1B dataset.  |
+## Example Usage
+Load a pretrained model using `from_preset`.
+```python
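+import numpy as np
+import keras_hub
+
+image_size = 1024
+batch_size = 2
+# Placeholder inputs: images plus point, label, box, and (empty) mask prompts.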
+input_data = {
+    "images": np.ones(
+        (batch_size, image_size, image_size, 3),
+        dtype="float32",
+    ),
+    "points": np.ones((batch_size, 1, 2), dtype="float32"),
+    "labels": np.ones((batch_size, 1), dtype="float32"),
+    "boxes": np.ones((batch_size, 1, 2, 2), dtype="float32"),
+    "masks": np.zeros(
+        (batch_size, 0, image_size, image_size, 1)
+    ),
+}
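+# Download the preset weights and build the segmenter.
+sam = keras_hub.models.SAMImageSegmenter.from_preset("sam_base_sa1b")
+# `predict` returns a dict holding the predicted masks and their IoU scores.
+outputs = sam.predict(input_data)
+masks, iou_pred = outputs["masks"], outputs["iou_pred"]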
+```
+Build a Segment Anything image segmenter with a custom backbone.
+```python
+import numpy as np
+import keras_hub
+
+image_size = 128
+batch_size = 2
+images = np.ones(
+    (batch_size, image_size, image_size, 3),
+    dtype="float32",
+)
+image_encoder = keras_hub.models.ViTDetBackbone(
+    hidden_size=16,
+    num_layers=16,
+    intermediate_dim=16 * 4,
+    num_heads=16,
+    global_attention_layer_indices=[2, 5, 8, 11],
+    patch_size=16,
+    num_output_channels=8,
+    window_size=2,
+    image_shape=(image_size, image_size, 3),
+)
+prompt_encoder = keras_hub.layers.SAMPromptEncoder(
+    hidden_size=8,
+    image_embedding_size=(8, 8),
+    input_image_size=(
+        image_size,
+        image_size,
+    ),
+    mask_in_channels=16,
+)
+mask_decoder = keras_hub.layers.SAMMaskDecoder(
+    num_layers=2,
+    hidden_size=8,
+    intermediate_dim=32,
+    num_heads=8,
+    embedding_dim=8,
+    num_multimask_outputs=3,
+    iou_head_depth=3,
+    iou_head_hidden_dim=8,
+)
+backbone = keras_hub.models.SAMBackbone(
+    image_encoder=image_encoder,
+    prompt_encoder=prompt_encoder,
+    mask_decoder=mask_decoder,
+)
+sam = keras_hub.models.SAMImageSegmenter(
+    backbone=backbone
+)
+```
+## Example Usage with Hugging Face URI
+Load a pretrained model from the Hugging Face Hub by passing an `hf://` URI to `from_preset`.
+```python
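+import numpy as np
+import keras_hub
+
+image_size = 1024
+batch_size = 2
+# Placeholder inputs: images plus point, label, box, and (empty) mask prompts.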
+input_data = {
+    "images": np.ones(
+        (batch_size, image_size, image_size, 3),
+        dtype="float32",
+    ),
+    "points": np.ones((batch_size, 1, 2), dtype="float32"),
+    "labels": np.ones((batch_size, 1), dtype="float32"),
+    "boxes": np.ones((batch_size, 1, 2, 2), dtype="float32"),
+    "masks": np.zeros(
+        (batch_size, 0, image_size, image_size, 1)
+    ),
+}
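+# Load this repository's weights through its Hugging Face URI.
+sam = keras_hub.models.SAMImageSegmenter.from_preset("hf://keras/sam_large_sa1b")
+# `predict` returns a dict holding the predicted masks and their IoU scores.
+outputs = sam.predict(input_data)
+masks, iou_pred = outputs["masks"], outputs["iou_pred"]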
+```
+Build a Segment Anything image segmenter with a custom backbone.
+```python
+import numpy as np
+import keras_hub
+
+image_size = 128
+batch_size = 2
+images = np.ones(
+    (batch_size, image_size, image_size, 3),
+    dtype="float32",
+)
+image_encoder = keras_hub.models.ViTDetBackbone(
+    hidden_size=16,
+    num_layers=16,
+    intermediate_dim=16 * 4,
+    num_heads=16,
+    global_attention_layer_indices=[2, 5, 8, 11],
+    patch_size=16,
+    num_output_channels=8,
+    window_size=2,
+    image_shape=(image_size, image_size, 3),
+)
+prompt_encoder = keras_hub.layers.SAMPromptEncoder(
+    hidden_size=8,
+    image_embedding_size=(8, 8),
+    input_image_size=(
+        image_size,
+        image_size,
+    ),
+    mask_in_channels=16,
+)
+mask_decoder = keras_hub.layers.SAMMaskDecoder(
+    num_layers=2,
+    hidden_size=8,
+    intermediate_dim=32,
+    num_heads=8,
+    embedding_dim=8,
+    num_multimask_outputs=3,
+    iou_head_depth=3,
+    iou_head_hidden_dim=8,
+)
+backbone = keras_hub.models.SAMBackbone(
+    image_encoder=image_encoder,
+    prompt_encoder=prompt_encoder,
+    mask_decoder=mask_decoder,
+)
+sam = keras_hub.models.SAMImageSegmenter(
+    backbone=backbone
+)
+```
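
The `predict` examples above return a `masks` array and an `iou_pred` array. As a minimal sketch of one way to post-process them, the snippet below keeps, for each image, the candidate mask with the highest predicted IoU and thresholds it into a boolean mask. The helper name `select_best_masks`, the assumed shapes `(batch, num_masks, height, width)` and `(batch, num_masks)`, and the logit threshold of 0.0 are assumptions for illustration, not something this model card specifies. Synthetic arrays stand in for real model outputs so the snippet runs without downloading weights.

```python
import numpy as np


def select_best_masks(masks, iou_pred, threshold=0.0):
    """Pick the highest-scoring candidate mask per image (hypothetical helper).

    Assumes `masks` has shape (batch, num_masks, height, width) and holds
    raw scores, and `iou_pred` has shape (batch, num_masks).
    """
    masks = np.asarray(masks)
    iou_pred = np.asarray(iou_pred)
    # Index of the candidate with the largest predicted IoU for each image.
    best_index = np.argmax(iou_pred, axis=-1)
    # Gather one mask per image using paired (batch, candidate) indices.
    batch_index = np.arange(masks.shape[0])
    best_scores = masks[batch_index, best_index]
    # Threshold the raw scores into a boolean mask.
    return best_scores > threshold, best_index


# Synthetic stand-ins for outputs["masks"] and outputs["iou_pred"].
rng = np.random.default_rng(0)
fake_masks = rng.normal(size=(2, 4, 256, 256)).astype("float32")
fake_iou = np.array(
    [[0.1, 0.9, 0.3, 0.2], [0.8, 0.2, 0.5, 0.4]], dtype="float32"
)

binary_masks, best_index = select_best_masks(fake_masks, fake_iou)
print(binary_masks.shape, best_index)
```

With real outputs, the same call would be `select_best_masks(outputs["masks"], outputs["iou_pred"])`, provided the shapes match the assumptions above.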